Tag Archives: haskell

Relative time FRP

Conal seems to have discovered an essential change in the semantics of FRP. The change is to think in terms of relative time instead of absolute time (Einstein would approve). It really cleans up a lot of things in the semantics, and it makes the implementation local (locality being something functional programming is very good at making efficient).

So, for example, instead of Behavior a = Time -> a, we have Behavior a = Time -> a :-). The difference is simply that on the left, “now” is whatever time you give it, whereas on the right, “now” is 0.

However, this change has far reaching consequences on how I think about FRP. The difficulties I have been trying to overcome in my implementations no longer make any sense. This is a good sign.

I have been gradually hacking on a “practical” implementation of FRP, i.e. putting aside the beauty of the implementation and doing whatever I can to make it efficient and easy to use. But man, it’s awful. I have no certainty whatsoever that the 200 lines I have written so far are anywhere near correct, other than that they seem to work when I play with them. I don’t like that feeling.

I have the urge to revisit the precise and rigorous implementations I had been working on, which I gave up on because they couldn’t efficiently express what I wanted them to. In particular, this relative-time idea with comonads seems very simple. So I want to write it together with my “correct blocking thunks”: thunks that are technically blocking, but never actually block, because you have to prove that they won’t before you evaluate them :-)
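Here is a minimal sketch of what I mean, assuming a numeric Time (my rendering, not necessarily Conal’s formulation): with “now” pinned at 0, sampling and time-shifting give Behavior the structure of a comonad.

type Time = Double  -- assumed numeric time, for this sketch only

newtype Behavior a = Behavior (Time -> a)

-- "now" is always 0:
extract :: Behavior a -> a
extract (Behavior f) = f 0

-- Shifting re-centers a behavior, so every point becomes a local "now":
duplicate :: Behavior a -> Behavior (Behavior a)
duplicate (Behavior f) = Behavior (\t -> Behavior (\s -> f (t + s)))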

I’m pretty excited. I’m hopeful that this new FRP is expressive and powerful in addition to its beauty. I won’t know that until I rewire my brain a little.

Screw laziness (w.r.t. Fran semantics)

Reactive is getting more mature, but it still does not support “feedback” behaviors (a bug), and it still does not support true first-class Behaviors (an essential limitation), meaning you cannot efficiently put a Behavior in a data structure and save it for later use. Judging by the reactive mailing list, this limitation is proving unimportant; plenty of things are still possible. But I am a stubborn, hubristic person, and I simply insist on Fran semantics (Behavior & Future) with first-class Behaviors.
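For reference, the Fran semantics I mean, as a bare denotational sketch (assuming a numeric Time; the real library would refine these types):

type Time = Double

newtype Behavior a = Behavior (Time -> a)  -- a value at every time
newtype Future a   = Future (Time, a)      -- a value that arrives at some time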

I have tried and I have tried, and I am quite convinced that the Fran semantics are not implementable lazily and efficiently in Haskell. I think I have come up with a garbage collector extension that would allow it, but I’m not going to pursue it. Knowing that such an extension exists is important to me, however, as I see hardware such as The Reduceron playing a big role in the future. The IO monad is not going to fly in a reduceronian world.

My impatience for a functional reactive game and UI library is growing. I could have made one in Fregl, but I became convinced that Fregl was too inefficient, and that I required weak-reference-free laziness to make it efficient. In retrospect, however, Fregl was inefficient for many more reasons than simply its eagerness. I am thus throwing aside my laziness requirement, and am now setting out to make a very efficient eager FRP library with Fran semantics.

Perl hackers know the three virtues: laziness, impatience, and hubris. I find it amusing that the one I’m having to throw away in a lazy language is laziness itself.

Restricted Data Types

With some spare boredom this morning, I was skimming the section on functors in RWH. A little ways down they introduce the type:

data Eq a => Bar a = Bar a
instance Functor Bar where
    fmap f (Bar a) = Bar (f a)

And they point out how the constraint does not allow a valid Functor instance. But what if it did?

What I’m getting at is that nobody ever uses constraints on data definitions like that, because they’re basically useless (AFAIK, they just impose the constraint on each constructor and nothing more). Perhaps they could work in this way: the mention of the type Bar a implies the constraint Eq a. Thus:

fmap :: (a -> b) -> Bar a -> Bar b

The caller already provides the dictionaries Eq a and Eq b for you, because the types “Bar a” and “Bar b” come from his environment. I’m not sure what I mean by environment there.

The time you would be required to provide a dictionary would be if you used a Bar in the body of a function without mentioning it in the type signature.
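To illustrate the idea with hypothetical code (none of this is valid Haskell today; it is just what the rule would mean):

-- Bar a and Bar b appear in the signature, so the caller’s environment
-- supplies Eq a and Eq b, and this definition would typecheck:
fmapBar :: (a -> b) -> Bar a -> Bar b
fmapBar f (Bar a) = Bar (f a)

-- Bar appears only in the body, not the signature, so this function
-- would have to supply the Eq Int dictionary itself (from the instance
-- in scope):
usesBarInternally :: Int -> Int
usesBarInternally n = case Bar n of Bar m -> m + 1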

If this makes sense (I’m not convinced it does in the least), it might be a nice way to solve the class restriction problem.

Udon Sketch #2

The core library of Udon is basically finished (see my previous post for a sketch of what Udon is). It needs cleanup, documentation, and a QuickCheck suite, but the functionality is there and working in my small tests. I have begun udon-shell, which is a way you can interact with Udon objects from the command line. Mostly it’s just a way to test things, but it may become an integral helper tool for the more complex tools to come.

My favorite thing about the Udon core is that there is nary a mention of IO anywhere. It is a purely functional core, the way I was told real software is written when I was a Haskell child. :-)

So, where to go from here? I have two projects in mind, which may just be one project. One of them is to make a version control system which blurs the idea of a “project”, so I don’t care whether it’s one or two. The beautiful thing about this library is that I can model these projects in a purely functional way (as long as I don’t reference any functions in my persistent data structures), and the persistence and distributedness comes for free.

Here is how I picture the “cabal file” for the package manager.

type PackageName = [String]  -- list of components
type ImportList = [(PackageName, PackageName)] -- use module Foo, name it Bar
data Package = Package {
    name :: String,  -- whatever you want, no thought involved, rename at will
    depends :: [(ExtRef Package, ImportList)],
    ...
  }

Which is the essence: packages are referred to by reference rather than by name, and modules from a package are explicitly imported and can be renamed to avoid conflicts. I’m not worrying about how this interacts with the state of the GHC art; it’ll work out.
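As a hypothetical example of a dependency entry (someBinaryRef is made up; it stands for an ExtRef you got from the object store):

-- Import the module Data.Binary from the referenced package, renaming
-- it locally to Util.Binary to avoid a collision.
exampleDepends :: [(ExtRef Package, ImportList)]
exampleDepends = [(someBinaryRef, [(["Data","Binary"], ["Util","Binary"])])]
  where someBinaryRef = undefined  -- hypothetical: fetched from the store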

I’ll come back to this in a bit. First I want to talk about smart versioning.

Smart versioning has little to do with Udon, but as long as I’m making a version control system I might as well play with it (though I doubt I have the attention span to carry it to its full potential). There are more types of content than just text and directories. For example, one might write a “Haskell source” plugin which understands Haskell source code, rather than just seeing it as a collection of lines. Then you could tell the plugin about the refactors you do, such as renaming a function. If that change is merged with a version from before that change, any true references to that function will get renamed (but shadows of that name would remain).

Taking this further, say there were a “Haskell package” content type. Then if you broke backward compatibility between two versions, you would annotate how. If it’s something simple like a renaming, the package manager could automatically upgrade your project to work with the new version. If it’s something complex like the semantics of a function, it could see if you ever used that function and mark the locations in your code that needed to be updated.

Such patch abstractions, which are of course custom depending on the content type, could be the basis for the version control system. I think Darcs does this kind of thing too; I’m not sure. But I use git’s “object identity” theory rather than Darcs’s patch theory.

So, given smart versioning patches, what is a proper package spec?

I want to support:

  • A package allowed to depend on multiple versions of another, with forward compatibility as I just described.
  • Splitting a package into two.
  • Merging two packages into one.
  • Renaming stuff (of course).

A package can vary by any combination of its dependencies. My hunch is that there should be a different object for each combination of dependencies (together with versions). But we don’t store all of them. Instead we store a root package, presumably depending on whatever precise versions of dependencies it was developed with, and then derive the package for other sets of dependencies using package diffs. To ensure robustness, a test suite should be included in the package to make sure the derivations were correct (though diffs ought to be pretty sure of themselves to begin with).
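A purely illustrative shape for that hunch (these names are mine, not Udon’s; Hash and ExtRef are stand-ins for the real thing):

newtype Hash = Hash String
data ExtRef a = ExtRef Hash (Maybe a)
data Package = Package          -- fields as in the sketch above
data PackageDiff = PackageDiff  -- a smart patch, as described above

newtype DepSet = DepSet [(String, Hash)]  -- exact version per dependency

-- One stored root object, developed against exact versions, plus diffs
-- deriving the package for other dependency combinations, plus tests
-- to check that a derivation was correct.
data PackageVersions = PackageVersions
    { root        :: (DepSet, ExtRef Package)
    , derivations :: [(DepSet, PackageDiff)]
    , testSuite   :: ExtRef Package
    }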

As scary as it might seem, this is already better than cabal. With cabal, the package makes some wild declaration about which versions of its dependencies it can support. And it can lie. There is no automated test that checks whether it told the truth, and there is no way to inform a package whose dependencies are too liberal that the semantics of a function changed while its type stayed the same (thus introducing subtle bugs). Basically, for cabal to work, you should have checked a package against every version of every dependency you declare. But nobody does that.

Also, what of the case where a package needs modification between two versions of one of its dependencies? You have to resort to gross conditional compilation stuff.

Hmm, a troublesome case comes to mind. Suppose the situation I just described happens, where foo depends on bar: foo-A works with bar-A, and foo-B works with bar-B. Now I find a bug in foo. I don’t want to have to fix it twice. So let’s say I fix it in foo-A. What happens to foo-B? How does it get the fix?

Well, this is a purely functional package manager, so I can’t really fix it in foo-A. So let’s say I fix foo-A and it becomes foo-A’. Then what I need to do to foo-B to fix it is apply the patch foo-A’/foo-A, to get foo-B’. Hmm, I guess that’s it. It will be tricky for a package author to keep this dependency information coherent for other packages to use.

Okay, I need to get to bed. Final sketch: packages do not refer to other versions of themselves. They stand alone, referring to their dependencies by exact version. Then edges (noodles?) are published, together with a possible “upgrade direction” (you always want the target of this edge if you can). Then some clever algorithm tries to upgrade the package and packages that depend on it. If one of the dependents fails, you can try it with an appropriate edge for the dependent (if one exists).

Hmm, a nice thing this solves is the ability to resolve (to some extent) that horrid dependency on multiple versions of the same package, by renaming one of the versions to something else. Then it is apparent and resolvable, at least, when you are treating two types from different versions as equal when they are not.

Sketch of Udon (Version Control/Packaging system)

I’m kind of tired, so don’t expect this post to be entirely coherent. I just wanted to write a little bit about the project I started recently, called Udon.

The idea came to me as the common ground between several problems I have seen recently. I will not go into the intricate details of why I consider them problems worth addressing. They are:

  • The namespace problem on Hackage, CPAN, and related module collections. Specifically, ridding ourselves of the rigid, arbitrary taxonomy of packages without the root namespace getting uselessly polluted.
  • The fact that no popular DVCS allows moving files between projects while preserving history in a sane way.
  • The desire for a purely functional operating system.

Incidentally, these are programming culture problems #2, #3, and #4 on my list (#1 is, of course, FRP). I’m glad I took a break from FRP so I could see this common ground.

Udon stands for “Universal Distributed Object Notation”, with a tip o’ the hat to JSON. And yes, it is a system for serializing objects. It is a gigantic distributed store for inter-referential stateless objects, and would support a huge distributed data structure with no single computer containing all of it.

I’ll get to the implementation details in a minute. I’ll first address how it helps to solve the problems above.

It will help in the implementation of a DVCS that does support moving files between projects, by more or less eliminating the idea of a project. The versioning is all in commit objects, which refer to changes in some tree data structure (representing the directory tree of the project). This is how git works. The difference is that there is no restriction about what tree this is; it could be a tree “from another project”. Just picture how you would write git as pure Haskell, never having to touch the filesystem or anything, everything being objects in memory.
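Concretely, the picture is something like this (illustrative types only, using the ExtRef idea described below):

import Data.ByteString (ByteString)

newtype Hash = Hash String           -- stand-in for this sketch
data ExtRef a = ExtRef Hash (Maybe a)

newtype Blob = Blob ByteString       -- file contents
data TreeEntry = File (ExtRef Blob) | SubTree (ExtRef Tree)
newtype Tree = Tree [(FilePath, TreeEntry)]

-- A commit points at parent commits and at some tree; nothing says the
-- tree has to come from “the same project”.
data Commit = Commit
    { parents :: [ExtRef Commit]
    , tree    :: ExtRef Tree
    , message :: String
    }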

This DVCS, or something similar, helps to solve the namespace problem. A project could refer to the modules it uses by identity rather than name. So instead of the cabal file saying I depend on “binary”, the cabal file (well, data structure in the new model) would say I depend on “that thing over there”, and then import the modules from that package using an advanced import mechanism, which allows renaming or qualifying modules, etc. to avoid collisions.

After conceiving the idea, I realized that it is essentially the filesystem for a typed, purely functional operating system, which is another one of my fancies.

Okay, that’s probably more than enough vagueness for you math types.

The implementation is based on a flawed concept: the same flawed concept that all the DVCSes use, namely that the hash of an object is a unique identifier for that object. Basically we pretend that hash collisions don’t exist. Yuck. But I could not think of any other way to do it. (This may be the thing that prevents me from going very far with this project, since I am getting into the habit of proving my code correct, and I will never be able to do that with this code, since it is not correct.)

There is a special type called ExtRef. ExtRef a is semantically just a value of type a. The kicker is that it might be on someone else’s machine, on your usb drive, or even lost forever because nobody kept its data around. It is represented as data ExtRef a = ExtRef Hash (Maybe a). Hash is the “unique identifier” for the object, and you always have that. You might have an actual copy of the object too, but if not, the Hash is there so you can find it if you need it. This representation, and the whole Hashing idea, is hidden from the high-level interface, of course.

You work with ExtRefs through the External monad. It’s a coroutine monad (free monad over i × (o -> •)) which reports which ExtRefs need their associated objects to continue the computation. The low-level interface will allow implementation of a variety of ways to handle this. The simplest one is a local store of objects to be queried. As things get more mature, that could be expanded into one that looks at various remote network stores (based on some origin map or something), or asks the user where to look, etc.
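A minimal sketch of what I mean by that, with illustrative names (this is the idea, not Udon’s actual API):

import Control.Monad ((>=>), ap)
import qualified Data.ByteString as BS

newtype Hash = Hash String deriving (Eq, Ord, Show)

-- The single effect asks for the bytes behind a Hash and resumes:
data External a
    = Done a
    | Need Hash (BS.ByteString -> External a)

instance Functor External where
    fmap f (Done a)   = Done (f a)
    fmap f (Need h k) = Need h (fmap f . k)

instance Applicative External where
    pure  = Done
    (<*>) = ap

instance Monad External where
    Done a   >>= f = f a
    Need h k >>= f = Need h (k >=> f)

-- The simplest handler: a local store queried in IO.
runLocal :: (Hash -> IO BS.ByteString) -> External a -> IO a
runLocal _     (Done a)   = return a
runLocal fetch (Need h k) = fetch h >>= runLocal fetch . k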

And my hope is that the little Udon framework I’ve described here will be sufficient to model the DVCS and the hackage package system as if everything were just pure data structures in memory. Of course it won’t be quite as convenient (you need to annotate your data structures with some ExtRefs at least), but it would at least encourage pure modeling of these “real world” systems.

FRP Rant

FRP is eating my life away!

Here’s the thing. I love FRP. I love the concept, so much so that I am having a very hard time writing anything without it. I want to make games, servers, UI libraries, with it. I want to make things! But I am stuck in research mode, where I am still looking for an FRP library in which to make these things.

Why am I stuck searching? Because I don’t like Reactive. It’s just difficult to use, and difficult to make easy to use. The semantics are unclean (not all of them, just some essential pieces like Event, integral, and stepper), it’s abstracted so high that it takes me forever just to figure out that it doesn’t quite make sense in this little corner, and it’s all coupled together so I can’t figure out how to fix it without rewriting the whole damn thing. And, according to Conal, it still has the dreaded spacetime leak (which can fortunately but rather annoyingly be circumvented as I described almost a year ago).

What am I searching for? Well, whatever ends up being beautiful and easy to use. I’ve got a bunch of leads varying in ease of use and possibility of implementation. The ones which are obviously implementable are basically Reactive. However, my dream would be a pull-implementation of Fregl (the source code now lives here), because Fregl was great! I actually made stuff with it! In particular, everything to do with Events just rolled off my fingertips. The waiter semantics and the monad instance were the power tools analogous to the axiom of choice, where previously careful, meticulous reasoning becomes tactless and obvious under the great sledgehammer. I want a sledgehammer; that’s why I program in a Turing-complete language rather than a total one.

Fregl is no beauty on the inside, however, and this impaired the efficiency of the programs implemented in it. The GHC optimizer is dumbstruck trying to optimize any code written using the library; because of the nature of the GHC garbage collector and weak references, far more work ended up being done than necessary. Basically it was a completely push implementation, and everything in Haskell is optimized for pull implementations, so performance was terrible.

What I want most for Christmas is a Haskellian Fregl implementation.

data-memocombinators

Well, it’s not perfect, but it fills a hole that has been open for far too long: I just uploaded the data-memocombinators library to Hackage. The idea is to abstract away the manual memoization everybody keeps doing and pack it into tidy little combinators.

For those out of the loop of popular CS vernacular, “memoization” is a stupid name for “caching”. As a rather trivial example, take the naive definition of the Fibonacci function:


fib :: Integer -> Integer
fib n = if n <= 1 then n else fib (n-1) + fib (n-2)

This runs in exponential time (a rather beautiful O(φ^n), where φ is the golden ratio). If we want to make this run in linear time without using our brains, we just cache (er, memoize) the intermediate results.

import qualified Data.MemoCombinators as Memo
fib :: Integer -> Integer
fib = Memo.integral fib'
    where
    fib' n = if n <= 1 then n else fib (n-1) + fib (n-2)

Note how the recursive call in fib' is to fib, not fib'.

There is also a library (heh, by Conal—who, contrary to what one might believe by analysis of interests, I am not :-), called MemoTrie, which does the same thing. However I don’t agree with the typeclass approach to this problem. We are concerned not simply whether or not you can memoize a particular type, but how. For example, for some problem you may know that you only want to memoize integers less than 1000, or that you want to use a fast array for those and a binary trie for the rest, or that you accept Eithers, and you want to memoize Lefts but not Rights. This is why, as the name suggests, data-memocombinators provides combinators for building nicely controlled memo tables.
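For instance, the “fast array for small inputs, trie for the rest” table can be assembled from the library’s switch and arrayRange combinators (written from memory, so take the exact names with a grain of salt; fibSmart is just an illustrative name):

import qualified Data.MemoCombinators as Memo

-- Inputs in [0,999] go to a flat array; everything else falls back to
-- the generic integral trie.
fibSmart :: Integer -> Integer
fibSmart = Memo.switch inRange (Memo.arrayRange (0, 999)) Memo.integral fib'
    where
    inRange n = 0 <= n && n < 1000
    fib' n = if n <= 1 then n else fibSmart (n-1) + fibSmart (n-2)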

I would like this library to grow, so if anybody has any suggestions or implementations for memo tables they use commonly, do send them along!

Laziness and the monad laws

Recently on haskell-cafe, the rather trivial module OneTuple was announced. I looked over the source and saw something that bugged me.

data OneTuple a = OneTuple { only :: a }

instance Functor OneTuple where
  fmap f (OneTuple x) = OneTuple (f x)

I never write functor instances this way. This is too strict; I would have written it like this instead:

instance Functor OneTuple where
  fmap f = OneTuple . f . only

Because it’s lazier. As I was composing a message explaining to the author why his instance was incorrect and why mine was correct, I realized something: his instance is correct and mine is incorrect! To see why, observe the functor laws:

  fmap id = id
  fmap (f.g) = fmap f . fmap g

My implementation satisfies the second law, but violates the first: fmap id ⊥ = OneTuple ⊥!
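A quick way to see the violation in GHCi, with my too-lazy version renamed lazyFmap so it can sit beside the real instance:

lazyFmap :: (a -> b) -> OneTuple a -> OneTuple b
lazyFmap f = OneTuple . f . only

-- ghci> lazyFmap id (undefined :: OneTuple ()) `seq` "defined"
-- "defined"            -- a constructor manufactured out of ⊥
-- ghci> fmap id (undefined :: OneTuple ()) `seq` "defined"
-- *** Exception: Prelude.undefined   -- the strict instance acts like id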

The same thing happens with the monad laws; I would have written the monad instance this way:

  instance Monad OneTuple where
    return = OneTuple
    m >>= f = f (only m)

But this violates the right unit law:

  m >>= return = m

Since ⊥ >>= return = return (only ⊥) = OneTuple ⊥ /= ⊥

So, in summary: most Haskell hackers know not to make their functions too strict. But we have to be careful not to make our functions too lazy either, and not just from an efficiency standpoint!

General Update

My brand new Acer laptop’s video card died a few days ago. The computer still worked, I just couldn’t see anything. So, while it’s being fixed, I’m back to using my old piece of junk which refuses AC power while turned on. It has an old battery, so I’m limited to about an hour of usage before it forces me to take a break. Needless to say, this is not enough time to get any significant coding done.

Coq is totally awesome! I proved Rice’s theorem for the SK calculus, which was a bit more satisfying than proving that every natural is either even or odd. My proof is pretty ugly, I can see a lot of ways I could clean it up, but I forgive myself; I’m a noob.

Coq is making me very excited for dependent types. For example, toward the end of the aforementioned proof, I started to define some combinators so I could construct the “impossible program”:

 Definition compose := S (K S) K.
 Lemma compose_really_does : forall f g x, compose f g x = f (g x).
  (* ... *)
 Qed.

I can envisage doing this kind of thing all the time in my regular programming practice. Write a little function, then prove that it does what I want it to. I’ve already begun exploring ways to use Coq for my everyday programming tasks, such as clean ways to hook into Haskell libraries with Extraction. Who knows how fruitful this endeavor will turn out to be.

I have been making some interesting findings on the FRP front. Here’s an executive summary of each; I may or may not go into more detail in future posts.

(1) The semantics of behaviors are “functions of time”, but what the hell is time? In the real world, time has more structure than numbers. So I claim that in FRP what we’re actually constructing is causal transformations of time functions. A transformation R is causal if R(f)|t = R(f|t)|t for all f and t, where |t means “restricted to times before t”. As yet this has not led to any new implementation insights.
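Toy examples of the distinction, under a crude sampling model of my own (not an implementation): a running maximum only looks backward, while peeking ahead does not.

type Time = Double

-- Causal: the output at time t depends only on the input at times <= t
-- (approximated by sampling on a grid; defined for t >= 0).
runningMax :: (Time -> Double) -> (Time -> Double)
runningMax f t = maximum [f s | s <- [0, 0.01 .. t]]

-- Not causal: the output at time t peeks one second into the future, so
-- restricting the input to times before t changes the answer.
peekAhead :: (Time -> Double) -> (Time -> Double)
peekAhead f t = f (t + 1)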

(2) FRP would be simple, easy to work with, and expressive even without its Event type, if only there were an efficient Monad instance for Behavior. For most of the cases where we would use Event, we substitute Behavior :. Future.

(3) Behavior has an efficient Monad instance, but it is not implementable in Haskell as it stands. It requires some support for linear types. Behavior itself is not linear per se, but each separate lexical instance is always accessed in a monotone pattern. So if multiple references were indicated by the compiler with split :: Behavior a -> (Behavior a, Behavior a), then you could destructively update each reference to the latest time it was accessed and avoid the expensive “fast-forwarding” necessary for the naive Monad instance. There are some other tricks, but that is the main idea.

(4) (What I’m currently exploring) It may be possible to get this efficiency without linear types by embedding the linearity in a monad, getting the “Behavior Monad Monad”. Ironically, the best candidate for doing this so far is the “Behavior :. Behavior Monad” (semantically). Whether or not it ends up working, I don’t think it will be very easy to work with.

Slicing open the belly of the IO monad in an alternate universe

I’ve been looking for a way to do the pieces of I/O that are well-defined in the framework of FRP. For example, fileContents "someFile" is a perfectly good function of time; why should we be forced to drop into the semantic fuzziness of the IO monad to get it?

Well, after a long talk with the Anygma folks about it, getting not very far in the practical world, I decided to do something else to quench my thirst for the time being since software needs to get written. I’m going to have the FRP program request actions by sending sinks to the top level, and then have the results of those requests come back via futures. So:

type Sink a = a -> IO ()
type Pipe = forall a. (Future a, Sink a)
data Stream a = a :> Stream a

liftFRP :: Sink a -> IO a -> IO ()
liftFRP sink io = io >>= sink

Okay, that’s nice, but how do we prevent uses of these things from becoming a tangled mess? Input always needs to communicate with output, and it seems like it would just be a pain to coordinate that. It turns out we can stick it in a monad:

import Control.Monad.State  -- StateT, get, put

newtype StreamReaderT v m a = SRT { runSRT :: StateT (Stream v) m a }
    deriving (Functor, Monad, MonadTrans)  -- via GeneralizedNewtypeDeriving

readNext :: Monad m => StreamReaderT v m v
readNext = SRT $ do
    (x :> xs) <- get
    put xs
    return x

type IO' = StreamReaderT Pipe (WriterT (IO ()) Future)
-- using the IO () monoid with mappend = (>>)

liftFRP :: IO a -> IO' a
liftFRP io = do
    (fut,sink) <- readNext
    tell (io >>= sink)
    (lift.lift) fut

unliftFRP :: IO' a -> IO a
unliftFRP m = do
    stream <- makePipeStream  -- a bit of magic here
    ((fut,_),action) <- runWriterT (runStateT (runSRT m) stream)
    action
    return $! futVal fut

It looks like IO’ has the very same (lack of) semantics as IO. liftFRP and unliftFRP form an isomorphism, so we really can treat them as pulling apart the IO monad into something more versatile, and then putting it back together.

Also we get a nice, fun parallel version of liftFRP. I can’t decide if this should be the common one or not.

liftParFRP :: IO a -> IO' a
liftParFRP io = do
    (fut,sink) <- readNext
    tell (forkIO (io >>= sink) >> return ())  -- discard the ThreadId; the writer wants IO ()
    (lift.lift) fut

So using the liftFRP and unliftFRP, we are no longer “trapped in IO” as some folks like to say. We can weave in and out of using IO as we’re used to and using requests and futures when convenient. For example, it’s trivial to have a short dialog with the user via the console, and have the real time program ticking away all the while. Fun stuff!
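To make that concrete, such a dialog might look like this (hypothetical code riding on the liftFRP above):

-- A short console dialog as an IO' computation; each action is issued
-- as a request, and its result comes back via a future.
dialog :: IO' String
dialog = do
    liftFRP (putStrLn "What's your name?")
    name <- liftFRP getLine
    liftFRP (putStrLn ("Hello, " ++ name))
    return name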