Tag Archives: code

Life and Projects

I just lost my job. This happening was unexpected except to the deep knowledge of my subconscious, who, for the last week, has been practicing what to say when it happened. The reason is that my “level of productivity is not sufficiently high to justify [my] contract rate”. I agree with this, and although the whole thing has got me feeling a tad worthless, I am not beating myself up about it. The project was less familiar territory than I had expected, being of primarily object-oriented design (rather than functional — and yes “functional design” is a real thing). I have done a lot of OO programming in my day, but since I have started functional programming, my OO code never seems good enough. I now spend a good proportion of my time searching for cleaner, smaller, more obviously correct solutions, when the code that I am criticizing would have been just dandy 5 years ago.

Anyway, I won’t blather on any longer justifying why I am nonetheless still a good programmer. This post is about the future.

In any case, I feel like the event woke me up. I was spending my 25 hours a week working on a project I didn’t really care about (despite it being fun and working with good people), a few hours a week on my own projects, and the rest of the time on who-knows-what. Chilling out, wasting time on the internet, watching Netflix, getting high and playing the piano. It’s all a fine, relaxing lifestyle, but I am disappointed that I was spending so little time on my projects. So, I resolve to spend no less time working than I was before — at least 30 hours per week, all for my own goals.

For my own clarity, and for the interested, here is what I’m working on:

  • Evo (codename), a “serious” real-time strategy game written in Haskell, with my friend Max. The purpose of this game is twofold: (1) to be a ridiculously fun real-time strategy game (nothing good has come out of that genre for a few years) and (2) to prepare and exposit Haskell as a game development platform. Perhaps it is yet another attempt to solve my 6 year old thesis Game Programming is Too Hard.
  • Vatican, a lazy specializing interpreter, taking after the work in Michael Jonathan Thyer’s dissertation. This is largely a research project, in which I would be happy to get 1% of the performance of GHCi — just something to start from. Here is the repository.
  • An experimental programming language with a unique view of types: types are just properties (eg. “f :: Int -> Int” is another way of saying “for all x, if x is an Int then f x is an Int”). The language is essentially an untyped lambda calculus, but comes with a proof language in which the usual types and many more sorts of properties can be expressed and verified. There is some (mechanism as-yet undecided) macro facility that allows traditional type inference to be encoded as a macro in the language. There are a lot more ideas for this language floating around, but I am trying to stick to the principle “everything it does, it does perfectly”, so I’m holding off on new features until I know exactly where in the picture they appear.
  • I will consider blogging as officially one of my projects, to continue improving my writing. I am converging on a functional design methodology, my antipattern post hinting at its direction. But the feedback I’ve received on that post makes me realize that the argument and the details are not yet clear. I’d like to keep writing about it and exploring this methodology in order to one day pin it to rigorous principles.

30 hours a week of the above, or projects that the above evolve into. I don’t need to be a pedant: I know when I am and am not being productive.

Get up in the morning, work all day. –Philip Glass

Associative Alpha Blending

I recently revamped my graphics-drawingcombinators module to have a handsome denotational semantics. I may write a post going into full detail later, but the executive summary is that Image a = R2 → (Color, a). Due to TCM (the type class morphism principle), I get semantics for Image’s Functor instance from this, and if Color is a monoid I get the semantics for the Applicative and Monoid instances as well. OpenGL combines colors by alpha blending, so I happily defined my monoid instance as alpha blending:

 -- (shorthand: a*c scales every channel of the color c by the scalar a)
 mempty = Color 0 0 0 0
 mappend c@(Color _ _ _ a) c' = a*c + (1-a)*c'

It doesn’t take a genius to see that this violates two of the three monoid laws (the only one it satisfies is left unit: mempty `mappend` x = x). This is embarrassing. My new rigorous denotational semantics has a pretty awesome flaw.

To make this right, let’s redefine Color not as something which is drawn, but as a transformation on alpha-free colors. I do this because functions make great monoids: composition is always associative and identity is always a unit. But it’s not just any function, it’s an alpha blending function. So we say that f is a “Color” if there exist constants a and x such that f(c) = a x + (1-a) c, which I will write as f = [a; x]. Here a is a scalar and x is another alpha-free color like c. We would really like it if the composition of two Colors were also a Color. Let’s try:

f(g(c)) = [fa;fx]([ga;gx](c))
        = fa fx + (1 - fa) ([ga;gx](c))
        = fa fx + (1 - fa) (ga gx + (1 - ga) c)
        = fa fx + (1 - fa) ga gx + (1 - fa) (1 - ga) c
        = fa fx + (1 - fa) ga gx + (1 - fa - ga + fa ga) c
        = (fa fx + (1 - fa) ga gx) + (1 - (fa + ga - fa ga)) c

It looks like the “alpha” of our composition is (fa + ga - fa ga). Let’s call this a’. Now we have to get the left side of the addition into the form a’ r for some r. Let’s just hack it: multiply and divide[1] by a’.

        = a' (fa fx + (1 - fa) ga gx) / a' + (1 - a') c

And then we can read off the answer:

[fa;fx] . [ga;gx] = [a' ; (fa fx + (1 - fa) ga gx) / a']
   where a' = fa + ga - fa ga

For the identity, we need:

a x + (1 - a) c = c

Which is satisfied for a = 0 with x = anything, so we can use [0; 0].

Because we derived this from composition and identity, the laws must hold. The mathematically untrusting may wish to check this.

And there you have it: my new Color monoid which actually satisfies the laws. This is the way to compose alpha-blended colors — it reflects what happens if you draw one after the other, blending as you go. This is the math that pre-blends any segment of that sequence.
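
For concreteness, here is a minimal sketch of that monoid in Haskell. This is my own illustration, not the actual graphics-drawingcombinators API: RGB, scale and addRGB are made-up helpers, and I write an old-style Monoid instance to match the snippet above (on a modern GHC the mappend body would live in a Semigroup instance).

    data RGB   = RGB Double Double Double         -- an alpha-free color
    data Color = Color Double RGB                 -- [a ; x] from the derivation

    -- apply interprets [a ; x] as the blending function it denotes: f(c) = a x + (1-a) c
    apply :: Color -> RGB -> RGB
    apply (Color a x) c = scale a x `addRGB` scale (1-a) c

    instance Monoid Color where
        mempty = Color 0 (RGB 0 0 0)              -- [0 ; 0]
        mappend (Color fa fx) (Color ga gx)
            | a' == 0   = mempty                  -- per the footnote: only when fa = ga = 0
            | otherwise = Color a' (scale (recip a')
                            (scale fa fx `addRGB` scale ((1-fa)*ga) gx))
          where a' = fa + ga - fa*ga

    scale :: Double -> RGB -> RGB
    scale k (RGB r g b) = RGB (k*r) (k*g) (k*b)

    addRGB :: RGB -> RGB -> RGB
    addRGB (RGB r g b) (RGB r' g' b') = RGB (r+r') (g+g') (b+b')

A QuickCheck property comparing apply (f `mappend` g) against apply f . apply g (up to floating-point wobble) is an easy way for the mathematically untrusting to check the claim above.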

I should have known that OpenGL’s colors were transformations all along, since the color of an object that you see can depend on the color you used to clear the screen.


[1] But what if (fa + ga - fa ga) = 0? Fortunately, the only place this happens when these variables are between 0 and 1 is fa = ga = 0, which means both f and g are the identity color.

Haskell Antipattern: Existential Typeclass

I have observed a pattern of design emerging in various Haskell libraries, for example vty-ui, GlomeTrace, and XMonad. Here is an excerpt from vty-ui’s interface. The snippet is long but simple, and I need all of it to demonstrate my point.

class Widget w where
    render :: DisplayRegion -> w -> Image
    growHorizontal :: w -> Bool
    growVertical :: w -> Bool
    primaryAttribute :: w -> Attr
    withAttribute :: w -> Attr -> w
mkImage :: Widget a => Vty -> a -> IO Image
data AnyWidget = forall w. Widget w => AnyWidget w
data Text
data Box
data Fill
instance Widget AnyWidget
instance Widget Text
instance Widget Box
instance Widget Fill
text :: Attr -> String -> Text
hBox :: (Widget a, Widget b) => a -> b -> Box
vBox :: (Widget a, Widget b) => a -> b -> Box
hFill :: Attr -> Char -> Int -> Fill
vFill :: Attr -> Char -> Fill

This module models the concept of a GUI widget. It defines a class with a basis of operations on Widgets, and then instantiates a few data types into the class. To facilitate passing around arbitrary widgets instead of specific instances, it provides an existential AnyWidget type that wraps any instance of the class. This Any* wrapper is the tell-tale sign of this antipattern.

When I was using the above interface, I found myself immediately wrapping the results of text, hBox, and the other combinators in AnyWidget. By throwing away the information of exactly what instance of Widget it was, I lost nothing: all the same operations were available to me. Astute readers might notice that withAttribute is an exception, because it has a w to the left and to the right of an arrow, so if I pass it a Box, I know I will get a Box back. But I cannot use that knowledge for anything else, so it doesn’t buy me anything.

There is a simpler and more direct way to encode the same thing. Here is the interface I would have written to this library:

data Widget = Widget {
    render :: DisplayRegion -> Image,
    growHorizontal :: Bool,
    growVertical :: Bool,
    primaryAttribute :: Attr,
    withAttribute :: Attr -> Widget
}
mkImage :: Vty -> Widget -> IO Image
text :: Attr -> String -> Widget
hBox :: Widget -> Widget -> Widget
vBox :: Widget -> Widget -> Widget
hFill :: Attr -> Char -> Int -> Widget
vFill :: Attr -> Char -> Widget

I know just from looking at the types that all the code in the module could be converted to this style. Every operation has the exact same assumptions as in the old version, and all the extra guarantees that the old one provided were useless because they were never used as assumptions. Plus, you don’t have to wrap every call to a combinator with AnyWidget. By exposing the constructor for the Widget type, it gives users the same opportunity to make custom Widgets as the class interface gave.

You might be thinking, “but what if somebody else made a widget type which did use the additional assumptions that the separated data types provided, for example,

data Window
instance Widget Window
window :: Widget w => (Int,Int) -> w -> Window
windowSize :: Window -> (Int,Int)
adjustSize :: (Int,Int) -> Window -> Window

“Haven’t we just excluded this possibility from the interface?” Turns out that we have not; just replace the instance with a function:

toWidget :: Window -> Widget

See how this is going? Classes become data types, instances become functions. I’m just manually “expanding” what GHC does for you. Of course, the expansion is smaller than the original.
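
To make that concrete, here is roughly what such a toWidget could look like against the record version of Widget. The helpers windowRender, windowAttr, and windowWithAttr are hypothetical, standing in for whatever the Window implementation actually provides.

    toWidget :: Window -> Widget
    toWidget win = Widget
        { render           = \region -> windowRender region win   -- hypothetical helper
        , growHorizontal   = False
        , growVertical     = False
        , primaryAttribute = windowAttr win                       -- hypothetical helper
        , withAttribute    = \attr -> toWidget (windowWithAttr attr win)
        }

The recursion in withAttribute plays the same role that the w -> Attr -> w signature played in the class version; here it is just an ordinary field.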

I advocate a direct programming style in Haskell. Advanced type system features have their place, but plain old functions go a long, long way. Functions are the masters of reuse: when you use an advanced feature, you need a yet more advanced feature to abstract over it (think: classes < existential types < universally quantified constraints < unknown). But all you need to abstract over a function is another function.

Beliefs and the Unimpossibility of Functional Reactive Programming

Behaviors must be first class — this is a belief I have held for almost my entire tenure as an FRP researcher. The belief originated about two weeks after I learned of FRP, when I wrote my first FRP Arrow and in it tried to write a small game. The process was awkward; not as clean and natural as what I had dreamt of for the paradigm. The awkwardness arose when trying to wrangle a time-varying list of time-varying enemies. I concluded that Arrowized FRP (AFRP) was terrible at managing dynamic collections of behaviors, and therefore that an FRP system needed dynamic collections to be powerful enough to do nontrivial things.

Only half of my conclusion was justified. AFRP is indeed terrible at managing dynamic collections of behaviors. But I never questioned the latter half. It epitomizes what Conal Elliott calls “proof by lack of imagination”: since I couldn’t devise a way to write my game without dynamic collections, there must not be a way.

Conal and the rest of the FRP gang have been discussing the semantic junk in classic FRP. Clearly there is more (or rather, less) to an interactive Behavior than an arbitrary function between two non-interactive ones. With this identified, my mind immediately jumped to Arrows. That is what they are for: function-like things that can’t do quite as much as regular functions.

And for some reason — perhaps it is my increased functional maturity — I feel no need for first-class behaviors anymore, no awkwardness for their lack. Programming with them doesn’t feel like the rest of functional programming. It feels too open-ended; i.e. I can say too much in the code without saying it in the type. I feel now like Arrows are a perfect fit for FRP. And fortunately, we already have an efficient AFRP implementation. It’s not a great implementation — there are semantic and operational details that could be improved, but it is something to start from, which is more than I had before.

I don’t blame myself for temporarily believing that we needed first-class behaviors. But I am disappointed in myself for taking so long to question that belief. FRP was my world. I was determined to understand it and to make it a practical alternative to Haskell’s IO. But I ended up blatantly ignoring half of the ongoing research because I didn’t believe it would work. This is not acceptable for a serious researcher.

The perfect way is only difficult for those who pick and choose.
Do not like, do not dislike; all will then be clear.
Make a hairbreadth difference, and Heaven and Earth are set apart.
If you want the truth to stand clearly before you, never be for or against.
The struggle between for and against is the mind’s worst disease.
— Zen master Seng-ts’an

New Year’s Resolutions: Produce, Believe

I bring you two personal experimental hypotheses for 2010.

I am a Haskell module author. Constituting my released modules are those ideas which resisted the least when I opened the text editor. But two of them, data-memocombinators and graphics-drawingcombinators, have gained some popularity, and I feel rewarded for having written them.

Most of my ideas functionalize pieces of Haskell programming that are currently imperatively-minded, as you can see with the two aforementioned. But FRP, a particularly greedy such idea, has been stealing my tuits away from the others. I have envisaged functional command lines, package management, event handling, persistence, testing, and probably more that presently slip my mind. This brings me to my first new year’s resolution for 2010: Produce! It’s time to start coding these up instead of letting one very hard problem block the solution of moderately hard ones. Kicking off the year, I rewrote graphics-drawingcombinators to have a stronger semantic foundation, becoming version 1.0.0 (thanks Peaker and sinelaw for egging me on!).

My second resolution addresses an irrational fear I noticed a few days ago. The story begins with Hubris Arts, the game studio my friend Max and I started in July. We released our first game in November. We are in development on our second (codename “R4”) and prototyping our third (codename “Evo”). All of our development so far has been in C# with XNA, for a familiar reason: it was the path of least resistance. But as I prototype Evo I see all my beloved functional design patterns leaping out at me from between the lines of imperative blasphemy. So far the implementation seems a great fit for Haskell, but I am uneasy. Some part of me doesn’t believe that Haskell can do it. Haskell’s role in my life so far has been that of a language for beautiful koans and interesting experiments, but not for Real Software. But why not? My only evidence is my own fear.

Thus the second resolution: Believe in Haskell! I have decided to do Evo in Haskell. It is only by tackling Real Software that a language matures — it may not be ready now, but by doing it anyway, we will make it ready. If we can do it, that will be a wonderful success for the language, perhaps permanently parting us from C# and von Neumann’s clutches. If we can’t, at least we’ll have reasons why not — ideas for things to improve — instead of unsupported uneasiness and unconfidence.

Reflections on a Holy Grail (FRP)

It has been two years, twenty-six days, six hours, twenty-four minutes and thirty seconds since my eyes caught the first glimpse of my mistress, functional reactive programming. Her simplicity, beauty, and power have captivated me since, but she has teased me — all of us lonely fools, really — by slipping through my fingers every time, just as I thought I was getting to know her.

A year ago I swore off the cold bitch. She was transforming my life into a single, unhealthy obsession. I needed time to think. Time — there she is again. Maybe by giving myself space, I thought, I could stumble upon a packet of wisdom and finally capture her heart, without really looking for it. But that sort of distance is a lie, my life continues to be about her. Distance only makes the heart grow fonder.

Last week I bumped into her at work. I am not proud of what I did. Forgive me for I am only human, ultimately incapable of suppressing the basal instinct that has ruled my kind for millions of years. That’s right, I implemented her, in the back seat of my Scala convertible. It would seem but a few hour fling on a whiteboard and an editor for the unprivy, but O how it intoxicated me! And it has not, as yet, ended in the heartbreak I have felt over and over again — she is quietly sitting in the back of my codebase, hardly a hundred lines, even doing some good if you can believe it. Maybe I am just fooling myself and this time will be no different. But, at the very least, I feel like I may have learned something.

~~

Perhaps FRP is not the sparkling Holy Grail of interactive functional programming that we have made it out to be. Maybe we cannot find an efficient, general implementation because there is none. Perhaps what we currently know about FRP is what there is to know — a book of heuristics, a collection of experiences and wisdom that tells us which of its features will and will not work together. FRP is not a single feature, you know, it really is a rather large design space, and quite a lot can be done with only pieces of it.

What I did at work was remarkably effective: I sketched how our server would read in an FRP language on a whiteboard, and then I implemented a framework which supported exactly the operations I needed and no more. It took hardly two hours, and there is no spacetime leak, no GC nondeterminism. The situation in FRP seems to be something like that of modern theoretical physics: we have a few different theories that all work remarkably well for different kinds of problems, and if only they would work with each other, we would have a unified theory. But mysteriously, they just don’t seem to get along.

A unified theory is more than just something to please the eyes of pure functionalists — it is important to us because it would mean the fall of the IO monad regime. The book of heuristics that I claim FRP amounts to still needs a sin bin from which to retrieve the superglue that holds together the pieces of a complete application. So the school of semanticists cannot give a meaning to a complete program, only to pieces of it. This makes us unable to prove things about a program, and unsure how to meaningfully compose multiple programs (e.g. what kinds of things should and should not be allowed).

There are the quantum field theorists, attempting to bring gravity in to the regime of quantum mechanics; there are the string theorists, attempting to bring the rest of reality into the regime of general relativity. But I, as a physics layman, naively believe that neither of these approaches will end up being the workhorse of the eventual unified theory. I believe that said theory will come with a new way of looking at the world, rather than patching up the old ways. Maybe the expert in me can learn from the layman in me and try to solve the problem instead of clinging to the solution I think I desire.

What is the meaning of an interactive program?

Collision Detection with Enneatrees

Many of my games boil down to, at some level, a large collection of circles (or spheres) interacting with each other. I use circles for collision detection, and otherwise whenever I can for organization, because the results look more natural than using boxes. If you organize using boxes, your simulation will tend to “align” to the axes, which nature surely does not do.

So naturally, detecting when two of these circles are touching each other is one of the larger computational problems that my games face. And bewilderingly, I have not found anywhere a good solution that covers circles in a wide variety of sizes. That is what this article is about.

If your circles are all about the same size, and in particular fairly small in comparison to the needed accuracy of the algorithm, the obvious and de-facto solution is to use a quadtree. That is, start with the whole screen and divide it up into 4 regions, and do so recursively until you have a nice spatial arrangement of your objects, categorizing objects by which region their center lies in.

When you have a circle and want to see what other circles are touching it, you just look for other objects in the smallest region. Well, except you are not a point, you are a circle, so you might penetrate the edge of the region, in which case you need to look at the adjacent region as well. Well, except other circles might penetrate the edges of their regions, so you actually need to look at all adjacent regions. Well, except you might be really big, so you need to look in all the regions that you touch. Well, except other objects might be really big, so… you need to look in all the regions.

A quadtree has not bought us anything if circles can be arbitrary sizes; collision detection is still linear. You could say things like the number of very big circles is usually small (because they would collide away their competition), so you can put an upper bound on the size of circles in the tree and then just do a linear check on the big ones. But that is an awful hack with a tuning parameter, and it doesn’t generalize. Shouldn’t the right solution work without hacks like that? (For the fellow game developers in my audience: despite your instincts, the answer is yes. :-)

I have worked with quadtrees before, and saw this problem a mile away for my game with circles of many different sizes. The first variation I tried was to put circles into regions only if they fit entirely. Circles on the boundaries would be added to the lowest branch that they could fit into. Then to check collisions on an object, you look at where it is in the tree, check all subregions, and then check all superregions (but not the superregions’ subregions). You will get small things in your vicinity on your way down, and large things on your way up. And it’s still logarithmic time… right?

Wrong. Suppose that every object in the world lay on the horizontal and vertical center lines. Even very small objects would accumulate in the top node, and no region subdividing would ever occur. This degenerate situation causes linear behavior, even though just by shifting the center of the region a little bit you get a sane subdividing. It’s not just a “let’s hope that doesn’t happen” thing: there are a lot of bad situations nearby it, too. Any object lying on those center lines will be checked when any other object is testing for its collisions, because it is on the way up. This did, in fact, become a major issue in my 10,000 particle simulation.

But if we could somehow salvage the original idea of this solution without the degenerate case, then we would get logarithmic behavior (when the field size is large relative to the radii of the particles).

To understand this solution, I would like you to turn your attention to one dimensional simulations. Here the previous solution would be subdividing the real line into a binary tree.

But the new solution divides intervals into not two, not four, but three child intervals. However, not in thirds like you would expect. (0,1) is divided into (0,1/2), (1/4, 3/4), and (1/2,1) — the intervals overlap. Any “circle” (in 1-D, so it’s another interval) with diameter less than 1/4 must fit into at least one of these subregions.

We use that property to organize the circles so that their size corresponds to their depth in the tree. If you know the diameter of the circle, you know exactly how deep it will be. So now when we are looking downward, we really are looking for small circles, and when we are looking upward we are looking for large ones, and there is no possibility of an accumulation of small irrelevant circles above us that are really far away.

However, on your way up, you also have to check back down again. If you are in the left portion of the (1/4,3/4) interval, you might intersect something from the right portion of the (0,1/2) interval, so you have to descend into it. But you can prune your search on the way down, cutting out “most” (all but a logarithm) of it. For example, you don’t need to check the interval (1/16,1/8) and its children even though you are checking its parent (0, 1/2) — it is too far away.

When you are deciding which region to put a circle in and you have a choice, i.e. the circle fits in multiple subregions, always pick the center one. When the circles are in motion, this gets rid of the degenerate case where a circle bounces back and forth across a deep, expensive boundary.
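
Here is a rough Haskell sketch of the one-dimensional version (my own illustration; Tree, Circle and insert are made-up names, not code from my game). A circle is pushed down until its diameter exceeds a quarter of the current region’s width, preferring the center child when there is a choice:

    import Data.Maybe (fromMaybe)

    type Circle = (Double, Double)               -- (center, radius)

    data Tree = Node (Double, Double)            -- the interval (lo, hi) this node covers
                     [Circle]                    -- circles stored at this depth
                     (Maybe (Tree, Tree, Tree))  -- left, middle, right children

    insert :: Circle -> Tree -> Tree
    insert c@(x, r) (Node (lo, hi) cs children)
        | 2*r > w/4 = Node (lo, hi) (c:cs) children   -- too big for any child: stop here
        | otherwise = Node (lo, hi) cs (Just (descend (fromMaybe fresh children)))
      where
        w        = hi - lo
        fresh    = (leaf lo (lo + w/2), leaf (lo + w/4) (hi - w/4), leaf (lo + w/2) hi)
        leaf a b = Node (a, b) [] Nothing
        fits (Node (a, b) _ _) = a <= x - r && x + r <= b
        descend (l, m, rt)
            | fits m    = (l, insert c m, rt)         -- prefer the center child
            | fits l    = (insert c l, m, rt)
            | otherwise = (l, m, insert c rt)

The 2*r > w/4 test is what makes depth a function of diameter alone, which is the invariant the argument above relies on.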

If you generalize to 2D, you end up cutting a region into 9 subregions (thus the name enneatree). The way I did it was to alternate cutting horizontally and vertically, so that I didn’t have to have 9-way if-else chains. That had more subtleties than I had expected. I have yet to find a straightforward, elegant implementation.

The algorithm seems to work well — I am able to support 5,000 densely packed circles in C# without optimizations.

The inspiration for this algorithm comes from the field of computable real numbers (infinite-precision numbers — not arbitrary precision, but infinite). You run into trouble if you try to represent computable reals as infinite streams of bits, because some interactions might be “non-local”; i.e. you might have to look infinitely far down a stream to find out whether a bit should be 0 or 1. Those guys solve the problem in the same way I did, by dividing intervals into 3 overlapping subintervals, and using this 3-symbol system as their number representation.

I see a glimmer of a connection between the two problems, but I can’t see it formally. Maybe it would become clearer if I considered infinite collections of infinitesimal objects needing collision detection.

IO-free splittable supply

I know it has been a while since I posted, and I’m sorry to say this isn’t a terribly deep or interesting one.

I like the value-supply package on hackage. It takes (essentially) an infinite list of things and arranges them into an infinite tree randomly… but a particularly useful kind of random. The first one you look at just happens to be the first element of the list, the second one you look at comes from the second element, etc.

This sort of magic isn’t useful for its magic, it’s just useful because you can use something like Int and not run out of bits too soon. All I ever use it for is unique labels in computations that would be made too strict by the state monad. As a contrived example of such a computation:

    -- (assumes Control.Monad (liftM2) and Control.Monad.State are imported;
    --  allocateLabel yields the current label and advances the state)
    allocateLabel :: State Label Label

    labelList :: State Label [Label]
    labelList = liftM2 (:) allocateLabel labelList

    bad :: State Label [(Label,Label)]
    bad = liftM2 zip labelList labelList

The problem is that we don’t know what label to start with on the second labelList in bad… we never finish using them up in the first one. But if all you care about is uniqueness of labels, rather than their order, this computation is easily definable as a function of a Supply from value-supply:

    labelList :: Supply Label -> [Label]
    labelList sup = let (a,b) = split2 sup in supplyValue a : labelList b

    good :: Supply Label -> [(Label,Label)]
    good sup = let (a,b) = split2 sup in zip (labelList a) (labelList b)

Great. Only problem is, we can only make Supplys from the IO monad. As it should be, since the supply uses a nasty form of nondeterminism to assign elements to its trees.

But, if all you care about is uniqueness, which is the case for me more often than not, you can squash IO out of the interface (not the implementation, unfortunately). Like this:

    -- (needs RankNTypes; unsafePerformIO comes from System.IO.Unsafe, newSupply from Data.Supply)
    runSupply :: (forall a. Eq a => Supply a -> b) -> b
    runSupply f = f (unsafePerformIO (newSupply 0 succ))

In my opinion, this is the best possible interface for such a thing. Value-supply is only to be used when all you care about is uniqueness, not ordering or the precise values. Well, this interface encodes that exactly. When you use this function, all you are allowed to care about is uniqueness, any other concern is hidden behind the type system. It is also impossible to confuse indices from different supplies.
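
As a tiny usage sketch (split2 and supplyValue are from value-supply; distinctPair is just my own example name):

    distinctPair :: Bool
    distinctPair = runSupply (\sup ->
        let (a, b) = split2 sup
        in  supplyValue a /= supplyValue b)    -- True: sibling supplies yield distinct labels

Note that the lambda must be polymorphic in the label type, knowing only Eq — which is exactly the “all you are allowed to care about is uniqueness” guarantee.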

And this is a perfectly well-behaved function. You could demonstrate that by giving a pure implementation of Supply [Bool] (which just traces the path to the given leaf from the root). The unsafe business is merely an optimization (and for some use cases, like labelList above, a very potent one).

Hooray. I get my favorite evil module, with all of its evil safely contained in a jar that they call a box for some reason.

Emphasizing Specialization

I have briefly—or perhaps not so briefly, my interests being unpredictable beasts—shifted the focus of my research to lazy specialization. This topic was discussed extensively, but not comprehensively, in Michael Jonathan Thyer’s dissertation, Lazy Specialization, and seems to have received little attention otherwise. This is unfortunate. I consider it a very important topic.

Lazy specialization is the idea that a functional program can be incrementally optimized during its execution, in tandem with partial application. For example, consider the code:

    let g = f x in (g y, g y')

A lazy specializer would evaluate g as far as it could using only information about x, and then apply the resulting optimized function to y and y’.

If most of f’s work could be done using x alone, you could save a lot of time this way. That is the obvious appeal. The drawback on the other side is that the evaluation machinery of a lazy specializer is more involved than a whnf evaluator, so the whole system takes a speed hit as a result.
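
To make the appeal concrete, here is the kind of staging programmers currently do by hand (a contrived f of my own, not an example from the dissertation):

    f :: Integer -> Integer -> Integer
    f x = let expensive = sum [1..x]    -- work that depends only on x
          in  \y -> expensive + y       -- the cheap part that also needs y

    shared :: (Integer, Integer)
    shared = let g = f 1000000 in (g 1, g 2)   -- under call-by-need, expensive is
                                               -- evaluated at most once

The promise of a lazy specializer is that you get this kind of sharing of x-only work even when f has not been manually written in this staged form.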

But so far as I can tell, the only such machines in existence are in a high-level, symbolic, proof-of-concept form, in which the hit is several orders of magnitude. Where’s the research getting down and dirty with this technology, seeing if we can bring it within a factor of 10 or less? Let me argue why we would want to pursue this seriously:

Here is a quote by Heinrich Apfelmus on laziness, at the end of his article on lazy sorting:

Well, it’s highly unlikely that algorithms get faster by introducing laziness. I mean, lazy evaluation means to evaluate only those things that are really needed and any good algorithm will be formulated in a way such that the unnecessary things have already been stripped off. But laziness allows to simplify and compose algorithms. Sometimes, seemingly different algorithms turn out to be two sides of the same coin when formulated with lazy evaluation. Isn’t it great that finding the k-th minimum is not only an adaption of quicksort but can readily be obtained from it by composing it with (!! k)?

I’d say that data structures are the sink for most of the creativity of lazy functional programmers. We spend lots of energy getting a data structure to represent exactly the things we need, sharing what needs to be shared. The key point about data structures is that they can be decomposed; you can peel off the root of a tree and leave yourself with only one of the branches, and the other branch will be garbage collected.

A lazy specializer promotes full-blown functions to the level of data structures. Functions can now be decomposed (by composing!) in the same way, sharing important pieces and forgetting unimportant ones. Observe this stream implementation:

    type Stream a = Nat -> a
    cons x xs z = case z of { Zero -> x; Succ z' -> xs z' }
    tail s z = s (Succ z)

Where Nat is the standard lazy natural type (constructors Zero and Succ). What happens when we evaluate tail (cons 1 (cons 2 R)), for some stream R (I don’t want to bother creating an infinite stream). I’ll use the HNF strategy that I discussed previously. Let’s watch:

   tail (cons 1 (cons 2 R))
   (\s z. s (Succ z)) (cons 1 (cons 2 R))
   (\z. cons 1 (cons 2 R) (Succ z))
   (\z. (\z'. case z' of { Zero -> 1; Succ z' -> cons 2 R z' }) (Succ z))
   (\z. case Succ z of { Zero -> 1; Succ z' -> cons 2 R z' })
   (\z. cons 2 R z)
   ...

Where did the 1 go? Gone, collected by the garbage collector, just as if we had written Stream in the more conventional way. GHC would have held on to the 1 forever. Nat -> a behaves exactly like the Stream data structure. The structure of Nat has induced a structure for Nat -> a. That is a beautiful form of data structure composability.

Working in a language that uses a lazy specializer also comes with a very important mental advantage: abstracting a pattern using a higher-order function always has constant-time overhead. I.e. you will never make your program run faster by unfolding abstractions. This encourages the more composable higher-order style, providing a snappy comeback to any speed nazi. It also removes the encouragement to fiddle with the details of a function for more speed: beta-equivalent formulations will all turn into the same code in the end, so clarity is the only remaining variable, and can be maximized.

Ideas like GHC’s {-# RULES #-} are no longer fickle compiler constructions, but can now be made trustworthy optimizations. If the left side of a rule is in a certain form (a restricted sort of normal form), then its evaluation can be baked into the interpreter and will be applied as aggressively as possible. It is invariant to fiddling around with higher-order presentations and whatnot; they will be folded away and the rule can be seen. (But I’m not convinced RULES are necessary anymore; perhaps you can simply express the rules you want as a functional construction?)

The most convincing property of lazy specializers to me was the primary bench test for Thyer’s dissertation: the “tower of interpreters” test. This is the property that, if you write an interpreter for a language L, and in language L write an interpreter for M, and in M write an interpreter for N, etc., all the layers will be squashed away and you will end up with code that runs just as fast as it did in the base language (with a constant time overhead for eliminating the layers). You can design a language with all the bells and whistles you want, give it a semantics, and you may immediately use it without a significant penalty. This is a huge abstraction win.

But there are a fair number of open problems in lazy specialization, more than just getting faster specialization machines. The explosion problem is tricky, in which, say, fix gradually becomes specialized to things like \f. f (f (f (f (f (f (f (fix f))))))); not useful, just eats up memory. The leap from complete laziness to optimal evaluation needs to be understood better. But in particular, I think we need a real LS to program with for a while, just to see what its dynamics are and to bring out its real-world problems.

I leave you with an addendum (which I snipped from the main argument), explaining my perceived connection between LS and FRP, my personal holy grail. All of my research links back to FRP somehow:


Finally, as if you didn’t know this was coming, there is a connection between lazy specializers and FRP, the holy grail of my functional programming life. It is a particular manifestation of the first point I made, designing streams as functions. I figure, all I have to do is design a data structure for time, with the right domain structure, perhaps something like:

   data Time = Zero | Next Double Time

And then define our friends:

   type Behavior a = Time -> a
   type Future a = (Time, a)

The implementation types are exactly their desired semantics: the most beautiful implementation I could ask for. Evaluation for a behavior b has a structure like:

   sample0 = b Zero
   b1 = b . Next 0.033
   sample1 = b1 Zero
   b2 = b1 . Next 0.033
   sample2 = b2 Zero
   ...

Each successive b is optimized, the unneeded bits at the front thrown off just as in the Stream example. The reason this can’t be encoded in regular Haskell is because the algebra for FRP includes a value of type Future (Behavior a), and we can’t see inside the Future because it is a functor, and thus arbitrary functions separate us from the enclosed Behavior, about which we must selectively forget things. However, the lazy specializer has no trouble looking through those functions, and will modify the behavior anyway, ridding us of the dreaded space leak.

On the By functions

Here’s a quick little note that has been bugging me for a while, and nobody wants to talk about it right now on IRC.

I think the By functions:

sortBy :: (a -> a -> Ordering) -> [a] -> [a]
maximumBy :: (a -> a -> Ordering) -> [a] -> a
groupBy :: (a -> a -> Bool) -> [a] -> [[a]]
nubBy :: (a -> a -> Bool) -> [a] -> [a]

Etc. should be replaced by On functions:

sortOn :: (Ord b) => (a -> b) -> [a] -> [a]
maximumOn :: (Ord b) => (a -> b) -> [a] -> a
groupOn :: (Eq b) => (a -> b) -> [a] -> [[a]]
nubOn :: (Eq b) => (a -> b) -> [a] -> [a]

My argument is: the functions provided to sortBy etc. have some preconditions. sortBy is not well-defined (or oughtn’t be) for functions which are not linear ordering functions; nubBy shouldn’t be well-defined (to the dismay of some 31337s) for functions which do not encode an equivalence relation. But the folklore is that functions are typically “as total as possible”, so if it wants a function of some type, all I have to do is conjure a function of that type and my output will be something reasonable in terms of that function.

On the other hand, the folklore of typeclasses is that they typically come with laws. You need to prove — or at least think you know how to prove — some laws when you make a type an instance of a typeclass. The On functions use this obligation to encode their precondition. They are easier to use in a verified setting, too; there are a bunch of standard instances of Eq and Ord for which the laws are known to hold; map your data on to that in any way you like and the precondition is guaranteed.
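
For the record, the On versions are definable in terms of the By versions in the obvious way. This is only a sketch of the correspondence (I use primed names rather than claiming these exact functions exist in Data.List):

    import Data.List (sortBy, maximumBy, groupBy, nubBy)
    import Data.Ord (comparing)
    import Data.Function (on)

    sortOn'      :: Ord b => (a -> b) -> [a] -> [a]
    sortOn' f     = sortBy (comparing f)

    maximumOn'   :: Ord b => (a -> b) -> [a] -> a
    maximumOn' f  = maximumBy (comparing f)

    groupOn'     :: Eq b => (a -> b) -> [a] -> [[a]]
    groupOn' f    = groupBy ((==) `on` f)

    nubOn'       :: Eq b => (a -> b) -> [a] -> [a]
    nubOn' f      = nubBy ((==) `on` f)

The translation only goes cleanly in this direction: an On caller can supply nothing worse than a (presumed lawful) Eq or Ord instance, while a By caller can pass any function of the right shape, precondition or no.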

Thoughts?