Monthly Archives: February 2008

Call-by-Future

I was reading about evaluation strategies for lambda calculus on Wikipedia, and one caught my eye: “call-by-future”. The idea behind this strategy is that you evaluate the function and its argument in parallel. It’s like lazy evaluation, except you start evaluating earlier. Call-by-future semantics are non-strict, so I figured I could coerce Haskell into doing it.

Template Haskell to the rescue again! I wrote a module which has this function:

  parApp :: (a -> b) -> (a -> b)
  parApp f x = x `par` f x

par is a special function which starts evaluating its first argument in parallel and then returns its second argument. The Template Haskell code then walks the syntax tree, replacing every application with an application of parApp, so that:

  fib 0 = 0
  fib 1 = 1
  fib n = fib (n-1) + fib (n-2)

Becomes:

  fib 0 = 0
  fib 1 = 1
  fib n = parApp (parApp (+) (parApp fib (parApp (parApp (-) n) 1)))
                 (parApp fib (parApp (parApp (-) n) 2))

Pretty, ain’t it? :-p
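For the curious, here is a minimal sketch of the expression rewrite. The name parallelize is mine, and a real version has to handle the rest of the Exp constructors and wrap this in a splice over declarations, but it shows the idea:

  {-# LANGUAGE TemplateHaskell #-}
  import Control.Parallel (par)
  import Language.Haskell.TH

  parApp :: (a -> b) -> (a -> b)
  parApp f x = x `par` f x

  -- Rewrite every application (f x) into (parApp f x), recursively.
  parallelize :: Exp -> Exp
  parallelize (AppE f x)  = AppE (AppE (VarE 'parApp) (parallelize f)) (parallelize x)
  parallelize (LamE ps e) = LamE ps (parallelize e)
  parallelize (InfixE (Just l) op (Just r)) =
      parallelize (AppE (AppE op l) r)    -- treat infix expressions as nested applications
  parallelize e           = e             -- variables, literals, etc. are left alone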

For the above program (test.hs) computing fib 40, when compiled with -threaded -O2 I get the following results on a dual quad-core machine:

  #  command         time      speedup   incr.speedup
  ./test +RTS -N1  # 20.508s   1.00      N/A
  ./test +RTS -N2  # 18.445s   1.11      0.56
  ./test +RTS -N3  # 15.944s   1.29      0.77
  ./test +RTS -N4  # 12.690s   1.58      0.94
  ./test +RTS -N5  # 11.305s   1.81      0.90
  ./test +RTS -N6  #  9.663s   2.12      0.97
  ./test +RTS -N7  #  8.964s   2.29      0.92
  ./test +RTS -N8  #  8.541s   2.40      0.92

The number after -N is the number of hardware threads used. The speedup is the ratio of the single-threaded time to the multicore time (in an ideal world this would match the number of hardware threads). The incremental speedup is (t_{n-1} · N_{n-1}) / (t_n · N_n), where t_n is the time and N_n the thread count in row n; i.e. how much of the ideal gain over the previous run we actually achieved. As long as this stays near one, the time is still decreasing roughly linearly. As we can see, we pay a lot of overhead mostly at 2 and 3 processors, and after that there is little additional overhead. There is too little data here to see what the large-scale trend is, though.

Suppose that the incremental speedup settles at a constant p < 1. Then t_{n+1} = t_n · n / ((n+1) · p). Doing a little algebra, we see that the time levels off at n = p/(1-p). So, for example, if p were 0.92, as it looks like it’s going to be, then n = 11.5. That is, adding a 12th processor is not going to gain you anything. The long-term goal for parallelization is to get solutions where p is very close to 1, because processors are going to start being cheap like hard drives, and I actually wouldn’t be surprised if 64-core machines were common in 24 years (calculated using Moore’s law).
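Spelling out that algebra (in LaTeX notation, with t_n the time on n processors and p the limiting incremental speedup):

  p = \frac{t_n \, n}{t_{n+1}\,(n+1)}
    \quad\Longrightarrow\quad
  t_{n+1} = \frac{n}{(n+1)\,p}\; t_n

  t_{n+1} \ge t_n
    \iff \frac{n}{(n+1)\,p} \ge 1
    \iff n \ge (n+1)\,p
    \iff n \ge \frac{p}{1-p}

  \text{For } p = 0.92: \quad \frac{p}{1-p} = \frac{0.92}{0.08} = 11.5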

So, 8 processors, 2.4x speedup. We could certainly ask for better. But it ain’t bad considering you didn’t have to know anything about your program at all to get it there :-).

Spiffy GADT Thing

Take a look at this C++ code:

#include <iostream>

class Base {
  public:
    virtual ~Base() { }
};
class Derived : public Base {
  public:
    virtual void talk() { std::cout << "What's up?"; }
};
void test(Base* b) {
    if (dynamic_cast<Derived*>(b)) {
        b->talk();
    }
}

What’s wrong with it? Well, nothing really; it’s perfectly well-defined. Except it doesn’t compile. But by the time we get to the call b->talk(), we know that b is actually of type Derived*, so we should be able to call talk() on it.

Now, I don’t expect languages to do this kind of magic. It’s perfectly reasonable for C++ to reject this code. But I was browsing the GHC source this morning and found a little hidden feature in the do notation. Haskell does do this!

Say we have this GADT:

data W :: * -> * where
    IntVal :: Int -> W Int
    AnyVal :: a -> W a

instance Show (W Int) where
    show _ = "W Int"

Then this code fails to compile:

test :: W a -> IO Int
test w = do
    print w
    IntVal i <- return w
    return i

Which makes sense, since w has type W a, and only W Int is an instance of Show. However:

test' :: W a -> IO Int
test' w = do
    IntVal i <- return w
    print w
    return i

Compiles just fine. That’s because if we got as far as the print w, then the pattern match succeeded, and thus w must be of type W Int as specified by the GADT. So imperative code in Haskell does narrowing, which most imperative languages don’t even do. Sweeet.
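For comparison, here is the same narrowing written with an explicit case expression instead of a do-notation pattern bind (the AnyVal branch is just an illustrative way to handle the failure case):

test'' :: W a -> IO Int
test'' w = case w of
    IntVal i -> do
        print w    -- in this branch w is known to be a W Int, so the Show instance applies
        return i
    AnyVal _ -> fail "no IntVal to match"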

Dependent Types for Compile Time

I’ve been reading some of Simon Peyton Jones’s papers on type families, and he mentions types representing Peano numerals, namely:

  data Zero
  data Succ a

And then proceeds to implement arithmetic like this:

  type family Add a b :: *
  type instance Add Zero x     = x
  type instance Add (Succ x) y = Succ (Add x y)
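As a quick sanity check (the synonyms One, Two, and Three are just illustrative names), the family reduces the way you’d expect; the definition below type-checks only because Add One Two and Three are the same type:

  type One   = Succ Zero
  type Two   = Succ One
  type Three = Succ Two

  check :: Add One Two -> Three
  check = id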

Such things are quite useful, and are often done using templates in C++ (taking advantage of C++’s Turing-complete compile phase). In C++ it looks like this:

  struct Zero;
  template<class T> struct Succ;
  template<class T, class U> struct Add;
  template<class U>
    struct Add<Zero, U> { typedef U Result; };
  template<class T, class U>
    struct Add<Succ<T>, U> {
       typedef Succ<typename Add<T,U>::Result> Result;
    };

One example is a type of fixed-size vectors, so that you can do, for example, a safe element-wise add. As someone who primarily builds domain-specific-language-like libraries, the main use I see for such types is building type systems for DSLs which are not “almost Haskell”.
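Here is a rough sketch of that fixed-size vector idea in Haskell, using the Peano types above and GHC’s GADTs extension (the names Vec and vadd are mine, purely for illustration); element-wise addition only type-checks when the two lengths agree:

  data Vec n a where
      Nil  :: Vec Zero a
      Cons :: a -> Vec n a -> Vec (Succ n) a

  -- Element-wise addition; the shared length index n forces equal lengths.
  vadd :: Num a => Vec n a -> Vec n a -> Vec n a
  vadd Nil         Nil         = Nil
  vadd (Cons x xs) (Cons y ys) = Cons (x + y) (vadd xs ys)

For example, vadd (Cons 1 (Cons 2 Nil)) (Cons 3 Nil) is rejected at compile time.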

However, in both C++ and Haskell, this is very clunky. There is no guarantee that Add<X,Y>::Result exists (in C++), likewise there is no guarantee that Add x y is defined (in Haskell). In addition, there is no guarantee that C++ will ever finish compiling; there is such a guarantee in Haskell, but that guarantee is made at the expense of not being able to do very much.

I want my compile time to be decidable. If I’m hacking on a huge project like Mozilla, I don’t want to start a rebuild, go get a cup of coffee during the usual 20-minute compile, and come back half an hour later just to realize that I’ve written a compile-time infinite loop. You can see how that would be a major productivity hindrance. On the other hand, I want numbers at compile time!

Meanwhile, I’ve been learning about languages like Agda and Epigram, which use dependent types. These languages have extremely powerful type systems, at the expense of essentially running the whole program at compile time. The distinguishing feature, though, is that an implementation of any program in Agda or Epigram comes with a proof that it will halt. That is, they are not Turing complete, but they can still do most things.

Then it occurred to me: would it be possible to have a full dependently-typed language for the compile phase of a program, but not push the type system as far down into the semantics as Agda and Epigram do, so that you can have infinite loops at runtime (and, just as importantly, don’t have to prove your program halts, since that is often pretty difficult)? For type-level stuff, though, I’m okay with doing such proofs, since my type-level stuff is infrastructure for my clients to prove their code is correct, so why shouldn’t I have to prove mine is?

The question is whether that is reasonable. Can I really get the desired power without “true” dependent types going all the way down to the value level? Dependently typed languages can use value-level functions in their types because those value-level functions come with halting proofs. We might provide two sorts of functions, “total” functions and normal functions, where total functions must come with a halting proof, with the benefit that you’re allowed to use them in your types. That doesn’t seem very clean, but it does seem practical.

Another practical issue with dependently typed languages is that many standard functions have more precise forms, and it would be easy for a smarter person to write a library that I couldn’t use because I’m not smart enough. Here is a passage of an Agda program demonstrating this:

  -- returns a list containing all naturals between a and b
  range : (a : Nat) -> (b : Nat) -> (a <= b) -> [ Nat ]
  range a b pf = map (\x -> a + x) (upto (subtract a b pf))

Look at the signature of that function. It’s a range function; it ought to take two arguments. But it looks like it takes three, and what the heck is that last argument? It turns out to be a proof that the first argument is at most the second. That wasn’t really necessary to implement range; I could have simply returned the empty list when a > b, but it felt cooler to require the proof.

That’s what feels unclean about this. There’s a time for extremely precise functions and there’s a time for less abstract ones, and mixing them is pretty tough (i.e. it’s hard to call a precise function from an imprecise one, because you don’t have the proofs you need). But it might be okay. Take Haskell’s compare function for example (after instantiating it to Int):

  data Ordering = LT | EQ | GT
  compare :: Int -> Int -> Ordering

In such a half-dependently typed language, we might instead imagine that function working like this:

  data Compare : Int -> Int -> Set where
    LT : (a < b) -> Compare a b
    EQ : Compare a a
    GT : (b < a) -> Compare a b
  compare : (a : Int) -> (b : Int) -> Compare a b

So comparing two numbers not only tells you whether one is less than the other, it proves it. But of course with every weakening of an assumption we have to strengthen a proof elsewhere, so now to implement an Ord instance you have to prove your comparison result (and, if it’s implemented fully, also some transitivity and reflexivity properties), which can be really annoying.

Anyway, I think it would be good to have such a language (I’ll call it “practical dependent typing”) just to play with to see how these issues work themselves out. I think implementing such a language shouldn’t be too hard using Agda: basically I have to allow Agda’s type system to relax in certain places. I’m just afraid of hacking on Agda, because those guys are way too smart for me…

UPDATE: Woohoo. It seems that my work is already done for me, because in Agda a function which can’t be proven total just issues a warning, but will still execute.

Fregl: an introduction to FRP

I’ve been mentioning quite a bit about my latest FRP implementation. First things first: You can find it here. The implementation is called “Fregl”, for “Functional Reactive Game Library”, roughly. But I thought I would take a moment to describe it and some of the directions I plan on taking it. The first section introduces FRP at a high level, and my intent is to write it so that someone unfamiliar with Haskell can at least get an idea of what it is about. The second section is much more technical, and goes into some of the implementation details. (Hmm, don’t have time right now to write the second section. It’s a post for another day.)

The Surface: Signals and Events

There are two major types of objects that make up the core of Fregl: Signals and Events. People familiar with FRP will feel right at home with these. A Signal is a value that varies over time, like any ordinary variable in a typical imperative language (they are defined and used very differently, though). An Event is just something that happens at some time, and then returns a value when it does happen. Both Signal and Event are type constructors, kind of like templates in C++; they carry with them an additional type. For example: Signal Int is a value of type Int that varies over time; Event Vec2 is an event that, when it happens, returns a 2D vector.

There are some core events, for example: waitClickPos, which returns the position of the next mouse click; and waitTimestep, which occurs when a simulation timestep happens and returns the amount of time since the last one. There are a handful of others, but for simplicity we’ll only use mouseClick, which we’ll define like this:

  mouseClick :: Event Vec2                        -- type signature (optional)
  mouseClick = waitClickPos ButtonLeft MouseDown  -- implementation

I should stress that Events are first-class objects just like any other (unlike events in C#!). There is nothing stopping a function from taking some arguments and returning an Event. In fact, waitClickPos does exactly that! It takes a button and a state and returns an Event representing when that mouse button enters that state.

It turns out that most of the logic in an FRP program comes in the form of custom events. Events are composable: you make big, complex events out of smaller ones. Here are a few examples. Non-Haskell programmers, I encourage you to try to understand this code too, it is pretty easy (and I’ve intentionally written it in a more imperative style).

  -- wait for two mouse clicks and return the position of the second one
  secondClick = mouseClick >> mouseClick

  -- wait for two mouse clicks and return the vector from the first to the second
  clickVector = do {
      pos1 <- mouseClick;
      pos2 <- mouseClick;
      return (pos2 ^-^ pos1);  -- ^-^ is vector subtraction
  }

  -- wait for a given amount of time
  waitTime time = do {
    if time <= 0 then
        return ()
    else do {
        dt <- waitTimestep;
        waitTime (time - dt);
    }
  }

  -- if there's a mouse click in the next 10 seconds, return its position,
  -- otherwise return (0,0)    (oh, how unhaskellish)
  clickInTenSeconds = eitherEvent waitTenSeconds mouseClick
      where
      waitTenSeconds = waitTime 10 >> return (0,0)

I hope by the last example you can start to see the power of such composability.

So far we have seen how to define your own Events, but Events don’t do anything. They just define what it means for something to happen, not what to do when they actually do happen. So we have to introduce an “event handler”, right?

Wrong. We need only two primitives, neither of which I would consider an event handler:

  -- return an event which happens instantly, returning the current value in the signal
  readSig :: Signal a -> Event a

  -- The returned signal takes on the value of the first signal until the event happens
  -- (and returns a signal), and takes on the value of the second signal from that point
  -- on.
  untilEvent :: Signal a -> Event (Signal a) -> Event (Signal a)

Examples of using these are coming soon. First, and this is the essential difference from classical game programming, note that there is no “signal assignment”. To determine what the value of a signal at a certain time should be, you look at the signal’s definition, and that tells you everything. The only things which can affect its value are the things it depends on, not the things that depend on it. This is one of the sorts of “referential transparency” that Fregl obeys. There will be another; read on.

Okay, so how do you actually use these things? Here is one example to get the feel for it. Again, non-Haskellers should be able to understand this code given a little thought. Oh, I should mention the backquote syntax: If I write foo `baz` bar, that just turns the function baz into an infix operator; i.e. it is equivalent to calling baz with the two arguments: baz foo bar.

  -- Takes an 'initial value' and returns an event which occurs instantly, returning
  -- a signal which counts the amount of time since the event occurred plus the
  -- initial value.
  time :: Double -> Event (Signal Double)
  time init = pure init `untilEvent` do {  -- "pure" constructs a constant signal
      dt <- waitTimestep;
      time (init + dt);
  }

Let’s take a moment to study that code. It says that the returned signal takes on the constant value ‘init’ ["pure init"] until ["untilEvent"] the next timestep ["waitTimestep"], at which point it behaves like time (init + dt). It is a recursive definition. (Nit for the leet: the last line should be time $! init + dt, lest we build up a huge chain of addition thunks.)
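For the record, that strict variant just forces the accumulator before recursing:

  time :: Double -> Event (Signal Double)
  time init = pure init `untilEvent` do {
      dt <- waitTimestep;
      time $! init + dt;   -- force the sum so we don't build up a chain of thunks
  }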

It returns the signal inside an Event which occurs instantly. This is very common of signals. You can think of Events not just as things that happen, but things that have a perception of time. That is, our time function here needs to know what the “current time” is for its definition, which is why it must be returned inside an Event. Don’t worry if this part doesn’t make sense yet.

Now we get to step way outside the realm of conventional programming and define some really powerful stuff. It turns out that it is possible to define the following function:

  -- Returns an event which happens instantly, and returns a signal which represents
  -- the integral of the given signal over time, starting from the time the event
  -- was fired.
  integral :: Signal Vec2 -> Event (Signal Vec2)

(The actual implementation works on more general types than just Vec2)

Using this, we can define the following program (very similar to examples/follow):

  -- we use the following predefined signal:
  mousePos :: Signal Vec2

  -- defines the position of a ball that follows the mouse around
  follower = mdo {
      let acceleration = mousePos ^-^ position ;
      velocity <- integral acceleration ;
      position <- integral velocity ;
      return position ;
  }

If nothing in the article has given you chills yet, this should. In the definition of follower, first we define the acceleration to be the target position (the mouse) minus the current position. What do you mean the current position, where is that defined? Look two lines down, there it is, defined as the integral of velocity! This is a circular definition of position, essentially asking the library to solve the equation position = ∫∫ (mousePos – position).

Then, using this definition, we can easily turn this into a program which draws a ball at that position using the (very limited so far) drawing primitives:

  main = runGameSDL $ do {
    ballPos <- follower;
    return (Draw.translate <$> ballPos <*> pure Draw.circle);
  }

And there’s the whole program.

This is just the beginning, the bare bones of FRP. It’s all embedded inside idiomatic Haskell, so we can use Haskell’s top-notch abstraction abilities to write our programs. To give you a flavor, here’s the signature of a neat function:

  -- Takes an event as an argument.  When this event fires, it should return
  -- a signal (representing perhaps the state of an actor) and another event
  -- (representing the death condition for this actor).  So the returned signal
  -- will be in the output list between the time the "creation event" fires
  -- and the time the "destruction event" fires.
  actorList :: Event (Signal a, Event ()) -> Event (Signal [a])

This function implements the idea of a list of active actors, something I’ve written over and over in C++, whose implementation is always spread out across my program (iterate through it once in the draw() routine, once in the step() routine, once in the events() routine). A less general variation of it appears in examples/shmup, and it only takes about 9 lines to implement, once and for all.
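As a hypothetical usage sketch, built only from pieces defined earlier in this post (the name balls and the ten-second lifetime are mine): an event that spawns a ball at each mouse click, where each ball sits at its click position and dies ten seconds later, with the positions of all currently-living balls collected into one signal:

  balls :: Event (Signal [Vec2])
  balls = actorList $ do {
      pos <- mouseClick;               -- creation event: wait for a click
      return (pure pos, waitTime 10);  -- its (constant) state signal and its death event
  }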

That’s it for the layman’s introduction. One of my major programming-evangelistic goals is to convince people that OO is not the best way to solve every problem. I hope that this introduction has shown you that there are some non-OO techniques which can be immensely powerful that the game community has not even begun to explore.

Here’s a teaser for next time. It turns out, not by any conscious plan of my own, just by coincidence, that Fregl’s implementation of these semantics admits a very nice parallel implementation. In theory, it scales linearly with the number of processors automatically (i.e. the game programmer need not put in any extra work), which is a big deal if I do say so myself. I’ll be performing a practical test on one of CU’s 8-processor machines in a few days.

Sleepless Night

It’s one of those nights again. A night I am tired enough to sleep, but I just don’t want to for a reason that evades me. Instead, I’m staying up watching bad movies and wasting time. One way, I thought, to waste time would be to write some nonsense in my blog.

My life is pretty busy right now, in a way that would be judged by someone else to be completely the opposite. When I’m not staring at my computer, I’m playing the piano. But inside I know that the aforementioned someone else doesn’t know what he is talking about. Here’s some of the interesting stuff I’m working on:

I have a very powerful incarnation of FRP with well-defined (but occasionally spooky) semantics. The world is broken up into signals (values that vary over time) and events (somewhat spooky). It has a nice referential transparency-like property which I painstakingly maintained in my design (which was not entirely easy since Event is isomorphic to IO (or rather, a transformer thereof)). It was very tempting to allow things like writing files and allocating names for objects inside events. But I thought it through, and the property I want to maintain disallows those things. Anyway, I programmed a little shoot-em-up in the FRP library I wrote, and it’s quite encouraging. I still am not sure quite how to think in FRP, but the abstraction abilities are amazing (as expected). The only problem is that it is quite slow at the moment, but there are abundant opportunities for optimization in the library, which I just haven’t had the inspiration to implement.

I’m also working on several games. The first is the linguistic magic game that I’ve been talking about here on and off for several years. I have the simulation, and even a rough spec for the language (which is reminiscent of the aborted child of Esperanto and Finnish). I can’t say much about it without going into a lot of detail, so I’ll just put that off until later.

The other game I’m working on is a music and rhythm game, which has hopes of transcending the sight-reading trend. I have tried to design an improvisation game before, and it was never too promising. But this one is promising, and I have it basically fully specced out in my head. I’m going back in the other direction, towards sight-reading, but not all the way. It’s a hybrid of sight-reading and improvisation, where the game is serving as the conductor (remember my founding idea for SNW?) and you, the player, are serving as the musician. As an example flow, the computer would give you a drum beat and might start you off with a Guitar Hero-style sight-reading thing as a bassline; i.e. “play this line verbatim”. You repeat that 6 times, and then the computer takes over: the bassline continues, but your fingers do not. Then maybe it gives you a line to play, but it just tells you the notes and the order, not the rhythm. You are in charge of inventing the rhythm for that line. And so on. There are soloesque line sections where it gives you a set of notes, and you just have to play all the notes in the set, but the order and rhythm are up to you. There are invent-your-own-bassline sections, where you play any bassline you like, but then you’ve got to repeat it (and keep coming back to it throughout the song). The key, tempo, drum part, and song structure are dictated by configuration files, but the exact incarnation of the song is up to the player.

I’m also learning Hungarian Rhapsody No. 2 by Liszt on the piano, which is the second most ridiculously difficult piece I’ve ever attempted (the first being—*snicker*—Rachmaninoff’s 3rd Piano Concerto, which, needless to say, I did not get very far into).