Compact data types

Join me, fellow readers, as I learn some topology by pretending I know something about it.

I recently uploaded Martin Escardó’s infinite search monad to hackage, under the name infinite-search. It is a delightful little thing to play with, and has inspired some deeper thoughts about the connection between computation and topology.

Quick refresher on topology: a topological space is a set X together with a collection of subsets of X, called the open subsets. The open subsets must include the empty set and the entirety of X itself, and must be closed under arbitrary unions and finite intersections.

So for example, you can generate the standard topology on the real line by taking all the open intervals (a,b) and closing them under arbitrary unions. So for example, the set $\cup \{ (a-\frac{1}{4},a+\frac{1}{4}) | a \in \mathbb{Z} \}$, which is ball of radius 1/2 around every integer, is open. No solitary points are open; every point must have an arbitrarily small cloud of nearby points.

As another example, there is a topology on the natural numbers called the discrete topology, in which you consider every singleton set to be an open set and take arbitrary unions over those. In this topology, every subset of the naturals is open.

A space is compact if “every open cover has a finite subcover”. That means for any collection S of open sets whose union is the entire space, there is a finite collection $S' \subseteq S$ whose union is also the entire space. Obviously all finite topologies are compact, because there are no infinite collections to begin with.

So the discrete topology on the naturals I talked about is not compact. Consider this cover: $\{ \{n\} | n \in \mathbb{N} \}$; just all the singleton open sets. This obviously has no finite subcollection which also unions to the entire set of naturals (if you leave out even one of these sets it fails to be a cover).

Okay, on to computation. Roughly what Martin Escardó’s work says is that compact spaces can be exhaustively searched. That is, if S is a compact space, then given a total function p :: S -> Bool, we can either compute an element of S satisfying p or compute that there is no such element. If we could figure out a way to form topological spaces on our data types, then we can know whether they are exhaustively searchable (and construct algorithms to search them).

A domain is the set of all elements of a data type, including partial ones (with ⊥s in them). So the domain for Bool is { ⊥, True, False }, and the domain for Either () is { ⊥, Left ⊥, Left (), Right ⊥, Right () }. To construct a topology (the Scott topology?) for the type, generate it by the sets of all compatible elements with some finite element of the domain. So the open sets for our Either () would be { { Left (), Right () }, { Left () }, { Right () } }. We also toss in the empty set, since it is the union of zero of these.

Earlier we saw that the naturals were not compact. This is a good sign, because if we could exhaustively search the naturals, then we could write an algorithm to eg. decide the twin-prime conjecture. However, I will now show that the lazy naturals are compact.

The lazy naturals are the data type:

data Nat = Zero | Succ Nat


What is the set of elements of this domain, and what is the topology? The set of elements is just the regular naturals plus the special one let infinity = Succ infinity. The topology has all the regular finite naturals as singleton sets, just like the discrete topology, but it also has subsets generated by { ⊥, Succ ⊥, Succ (Succ ⊥), … }. That is, sets of all of them greater than some natural including infinity. The key is that the singleton { infinity } itself is not open, it only comes from one of these infinite sets.

Now let’s say you have a cover for Nat. It includes infinity, so it must include the a set { n | n > k for some k }. So we pick that one, and then we pick however many sets we need to cover all the naturals less than k; there will be at most k of them. So we have constructed a finite subcover.

So regular naturals are not compact, but Nat is. So regular naturals are not searchable, but Nat is. But we got Nat only by adding to the regular naturals; how did we make it searchable in the process?

There was a little snag in the definition of searchable above: we needed a total predicate. Many functions which we would not be able to search on the regular naturals would get into an infinite loop if we passed the function infinity, and that would mean the predicate is not total. Intuitively, that means we can only search on predicates which are bounded in the size of the natural; which stop looking at some point.

I conjecture that every ADT in Haskell (without strict fields) is compact. It is pretty easy to write combinators which construct Data.Searchable Sets for any ADT; whether they are proper compact Sets (i.e. the search function is in fact total) is another issue.

So that’s it. We can make a compact set of Naturals and exhaustively search them. For fun, here’s a little Haskell implementation demonstrating this:

import Data.Searchable
data Nat = Zero | Succ Nat

nat :: Set Nat
nat = singleton Zero union fmap Succ nat

-- decidable equality on functions from Nat!
eqnatf :: (Eq a) => (Nat -> a) -> (Nat -> a) -> Bool
eqnatf f g = forevery nat \$ \n -> f n == g n


Go ahead, write a Num instance for Nat and play around. Just remember that your predicates need to halt even when given infinity, otherwise they are not total. In particular, this means that you can’t use eqnatf on Nat -> Nat, because equality on Nat itself is not total (infinity == infinity loops).

Advertisements

Relative time FRP

Conal seems to have discovered an essential change in the semantics of FRP. The change is to think in terms of relative time instead of absolute time (Einstein would approve). It really cleans a lot of things in the semantics up, makes the implementation local (which functional programming is very good at making efficient).

So, for example, instead of Behavior a = Time -> a, we have Behavior a = Time -> a :-). The difference is simply that on the left, “now” is whatever time you give it, whereas on the right, “now” is 0.

However, this change has far reaching consequences on how I think about FRP. The difficulties I have been trying to overcome in my implementations no longer make any sense. This is a good sign.

I have been gradually hacking on a “practical” implementation of FRP, i.e. putting aside the beauty of the implementation and doing whatever I can to make it efficient and easy to use. But man it’s awful. I have no certainty whatsoever than the 200 lines I have so far written are anywhere near correct other than they seem to work when I play with them. I don’t like that feeling.

I have the urge to revisit my precise and rigorous implementations I had been working on, which I gave up on because they didn’t efficiently express what I wanted them to. In particular, this relative time thing with comonads seems very simple. So I want to write it together with my “correct blocking thunks”, which are thunks that are technically blocking, but they never actually do because you have to prove that they won’t before you evaluate them :-)

I’m pretty excited. I’m hopeful that this new FRP is expressive and powerful in addition to its beauty. I won’t know that until I rewire my brain a little.

Screw laziness (w.r.t. Fran semantics)

Reactive is getting more mature, but still does not support “feedback” behaviors (a bug), and still does not support true first-class Behaviors (an essential limitation); meaning you cannot efficiently put a Behavior in a data structure and save it for later use. By observing the reactive mailing list, this limitation is proving unimportant; still plenty of things are possible. But I am a stubborn, hubristic person, and I simply insist on Fran semantics (Behavior & Future) with first class Behaviors.

I have tried and I have tried, and I am quite convinced that the Fran semantics are not implementable lazily and efficiently in Haskell. I think I have come up with a garbage collector extension that would allow it, but I’m not going to pursue it. Knowing that such an extension exists is important to me, however, as I see hardware such as The Reduceron playing a big role in the future. The IO monad is not going to fly in a reduceronian world.

My impatience for a functional reactive game and UI library is growing. I could have made one in Fregl, but I became convinced that Fregl was too inefficient, and that I required weak-reference-free laziness to make it efficient. In retrospect, however, Fregl was inefficient for many more reasons than simply its eagerness. I am thus throwing aside my laziness requirement, and am now setting out to make a very efficient eager FRP library with Fran semantics.

Perl hackers know the three virtues: laziness, impatience, and hubris. I find it amusing that the one I’m having to throw away in a lazy language is laziness itself.

Restricted Data Types

With some spare boredom I had this morning, I was skimming the section on functors in RWH. A little ways down they introduce the type:

data Eq a => Bar a = Bar a
instance Functor Bar where
fmap f (Bar a) = Bar (f a)


And point out how the constraint does not allow a valid functor instance. But what if it did?

What I’m getting at is that nobody ever uses constraints on data definitions like that, because they’re basically useless (AFAIK, they just impose the constraint on each constructor and nothing more). Perhaps they could work in this way: the mention of the type Bar a implies the constraint Eq a. Thus:

fmap :: (a -> b) -> Bar a -> Bar b


The caller already provides the dictionaries Eq a and Eq b for you, because the types “Bar a” and “Bar b” come from his environment. I’m not sure what I mean by environment there.

The time you would be required to provide a dictionary would be if you used a Bar in the body of a function without mentioning it in the type signature.

If this makes sense (I’m not convinced it does in the least), it might be a nice way to solve the class restriction problem.

Udon Sketch #2

The core library of Udon is basically finished (see my previous post for a sketch of what udon is). It needs cleanup, documentation, and a quickcheck suite, but the functionality is there and working in my small tests. I have begun udon-shell, which is a way you can interact with udon objects from the command line. Mostly it’s just a way to test things, but may become an integral helper tool for the more complex tools to come.

My favorite thing about the Udon core is that there is nary a mention of IO anywhere. It is a purely functional core, the way I was told real software is written when I was a Haskell child. :-)

So, where to go from here? I have two projects in mind, which may just be one project. One of them is to make a version control system which blurs the idea of a “project”, so I don’t care whether it’s one or two. The beautiful thing about this library is that I can model these projects in a purely functional way (as long as I don’t reference any functions in my persistent data structures), and the persistence and distributedness comes for free.

Here is how I picture the “cabal file” for the package manager.

type PackageName = [String]  -- list of components
type ImportList = [(PackageName, PackageName)] -- use module Foo, name it Bar
data Package = Package {
name :: String,  -- whatever you want, no thought involved, rename at will
depends :: [(ExtRef Package, ImportList)],
...
}


Which is the essence. That packages are referred two by reference rather than by name, and modules from a package are explicitly imported and can be renamed to avoid conflicts. I’m not worrying about how this interacts with the state of the ghc art; it’ll work out.

I’ll come back to this in a bit. First I want to talk about smart versioning.

Smart versioning has little to do with Udon, but as long as I’m making a version control system I might as well play with it (though I doubt I have the attention span to carry it to its full potential). There are more types of content than just text and directories. For example, one might write a “Haskell source” plugin which understands Haskell source code, rather than just seeing it as a collection of lines. Then you could tell the plugin about the refactors you do, such as renaming a function. If that change is merged with a version from before that change, any true references to that function will get renamed (but shadows of that name would remain).

Taking this further, say there were a “Haskell package” content type. Then if you broke backward compatibility between two versions, you would annotate how. If it’s something simple like a renaming, the package manager could automatically upgrade your project to work with the new version. If it’s something complex like the semantics of a function, it could see if you ever used that function and mark the locations in your code that needed to be updated.

Such patch abstractions, which are of course custom depending on the content type, could be the basis for the version contol system. I think Darcs does this kind of thing too, I’m not sure. But I use git’s “object identity” theory rather than Darcs’s patch theory.

So, given smart versioning patches, what is a proper package spec?

I want to support:

• A package allowed to depend on multiple versions of another, with forward compatibility as I just described.
• Splitting a package into two.
• Merging two packages into one.
• Renaming stuff (of course).

A package can vary by any combination of its dependencies. My hunch is that there should be a different object for each combination of dependencies (together with versions). But we don’t store all of them. Instead we store a root package, presumably depending on whatever precise versions of dependencies it was developed with, and then derive the package for other sets of dependencies using package diffs. To ensure robustness, a test suite should be included in the package to make sure the derivations were correct (though diffs ought to be pretty sure of themselves to begin with).

As scary as it might seem, this is already better than cabal. With cabal, the package makes some wild declaration about which versions of its dependencies it can support. And it can lie. There is no automated test that checks whether it told the truth, there is no way to inform a package with dependencies too liberal that the semantics of a function changed while its type stayed the same (thus introducing subtle bugs). Basically for cabal to work, you should have checked a package against every dependency you declare. But nobody does that.

Also, what of the case where a package needs modification between two versions of one of its dependencies. You have to resort to gross conditional compilation stuff.

Hmm, a troublesome case comes to mind. Suppose the case I just described happens, where foo depends on bar. foo-A works with bar-A, and foo-B works with bar-B. Now I find a bug in foo. I don’t want to have to fix it twice. So let’s say I fix it in foo-A. What happens to foo-B? How does it get the fix.

Well, this is a purely functional package manager, so I can’t really fix it in foo-A. So let’s say I fix foo-A and it becomes foo-A’. Then what I need to do to foo-B to fix it is apply the patch foo-A’/foo-A, to get foo-B’. Hmm, I guess that’s it. It will be tricky for a package author to keep this dependency information coherent for other packages to use.

Okay, I need to get to bed. Final sketch: packages do not refer to other versions of themselves. They stand alone, referring to their dependencies by exact version. Then edges (noodles?) are published, together with a possible “upgrade direction” (you always want the target of this edge if you can). Then some clever algorithm tries to upgrade the package and packages that depend on it. If one of the dependents fails, you can try it with an appropriate edge for the dependent (if one exists).

Hmm, a nice thing this solves is the ability to resolve (to some extent) that horrid dependency on multiple versions of the same package, by renaming one of the versions to something else. Then it is apparent and resolvable, at least, when you are treating two types from different versions as equal when they are not.

Sketch of Udon (Version Control/Packaging system)

I’m kind of tired, so don’t expect this post to be entirely coherent. I just wanted to write a little bit about the project I started recently, called Udon.

The idea came to me as the common ground between several problems I have seen recently. I will not go into the intricate details of why I consider them problems worth addressing. They are:

• The namespace problem on Hackage, CPAN, and related module collections. Specifically, ridding ourselves of the rigid, arbitrary taxonomy of packages without the root namespace getting uselessly polluted.
• The fact that no popular DVCS allows moving files between projects while preserving history in a sane way.
• The desire for a purely functional operating system.

Incidentally, these are programming culture problems #2,#3, and #4 on my list (#1 is, of course, FRP). I’m glad I took a break from FRP so I could see this common ground.

Udon stands for “Universal Distributed Object Notation”, with a tip o’ the hat to JSON. And yes, it is a system for serializing objects. It is a gigantic distributed store for inter-referential stateless objects, and would support a huge distributed data structure with no single computer containing all of it.

I’ll get to the implementation details in a minute. I’ll first address how it helps to solve the problems above.

It will help in the implementation of a DVCS that does support moving files between projects, by more or less eliminating the idea of a project. The versioning is all in commit objects, which refer to changes in some tree data structure (representing the directory tree of the project). This is how git works. The difference is that there is no restriction about what tree this is; it could be a tree “from another project”. Just picture how you would write git as pure Haskell, never having to touch the filesystem or anything, everything being objects in memory.

This DVCS, or something similar, helps to solve the namespace problem. A project could refer to the modules it uses by identity rather than name. So instead of the cabal file saying I depend on “binary”, the cabal file (well, data structure in the new model) would say I depend on “that thing over there”, and then import the modules from that package using an advanced import mechanism, which allows renaming or qualifying modules, etc. to avoid collisions.

After conceiving the idea, I realized that it is essentially the filesystem for a typed, purely functional operating system, which is another one of my fancies.

Okay, that’s probably more than enough vagueness for you math types.

The implementation is based on a flawed concept; the same flawed concept that all the DVCSes use, that the hash of an object is a unique identifier for that object. Basically we pretend that hash collisions don’t exist. Yuck. But I could not think of any other way to do it. (This may be the thing that prevents me from going very far with this project, since I am getting in the habit of proving my code correct, and I will never be able to do that with this code since it is not correct).

There is a special type called ExtRef. ExtRef a is semantically just a value of type a. The kicker is that it might be on someone else’s machine, on your usb drive, or even lost forever because nobody kept its data around. It is represented as data ExtRef a = ExtRef Hash (Maybe a). Hash is the “unique identifier” for the object, and you always have that. You might have an actual copy of the object too, but if not, the Hash is there so you can find it if you need it. This representation, and the whole Hashing idea, is hidden from the high-level interface, of course.

You work with ExtRefs through the External monad. It’s a coroutine monad (free monad over i × (o -> •)) which reports which ExtRefs need their associated objects to continue the computation. The low-level interface will allow implementation of a variety of ways to handle this. The simplest one is a local store of objects to be queried. As things get more mature, that could be expanded into one that looks at various remote network stores (based on some origin map or something), or asks the user where to look, etc.

And my hope is that the little Udon framework I’ve described here will be sufficient to model the DVCS and the hackage package system as if everything were just pure data structures in memory. Of course it won’t be quite as convenient (you need to annotate your data structures with some ExtRefs at least), but it would at least encourage pure modeling of these “real world” systems.

FRP Rant

FRP is eating my life away!

Here’s the thing. I love FRP. I love the concept, so much so that I am having a very hard time writing anything without it. I want to make games, servers, UI libraries, with it. I want to make things! But I am stuck in research mode, where I am still looking for an FRP library in which to make these things.

Why am I stuck searching? Because I don’t like Reactive. It’s just difficult to use, and difficult to make easy to use. The semantics are unclean (not all of them, just some essential pieces like Event, integral, and stepper), it’s abstracted so high that it takes me forever just to figure out that it doesn’t quite make sense in this little corner, and it’s all coupled together so I can’t figure out how to fix it without rewriting the whole damn thing. And, according to Conal, it still has the dreaded spacetime leak (which can fortunately but rather annoyingly be circumvented as I described almost a year ago).

What am I searching for? Well, whatever ends up being beautiful and easy to use. I’ve got a bunch of leads varying in ease of use and possibility of implementation. The ones which are obviously implementable are basically Reactive. However, my dream would be a pull-implementation of Fregl (the source code now lives here), because Fregl was great! I actually made stuff with it! In particular, everything to do with Events just rolled off my fingertips. The waiter semantics and the monad instance were the power tools analogous to the axiom of choice, where previously careful, meticulous reasoning becomes tactless and obvious under the great sledgehammer. I want a sledgehammer; that’s why I program in a Turing-complete language rather than a total one.

Fregl is no beauty on the inside, however, and this impaired the efficiency of the programs implemented in it. The GHC optimizer is dumbstruck trying to optimize any code written using the library; because of the nature of the GHC garbage collector and weak references, far more work ended up being done than necessary. Basically it was a completely push implementation, and everything in Haskell is optimized for pull implementations, so performance was terrible.

What I want most for christmas is a Haskellian Fregl implementation.

data-memocombinators

Well, it’s not perfect, but it is filling a hole that has been mostly absent for far too long: I just uploaded the data-memocombinators library to hackage. The idea is to abstract away the manual memoization everybody keeps doing and pack them away in tidy little combinators.

For those out of loop of popular CS vernacular, “memoization” is a stupid name for “caching”. As a rather trivial example, take the naive definition of the fibonacci function:

 fib :: Integer -> Integer fib n = if n <= 1 then n else fib (n-1) + fib (n-2) 

This runs in exponential time (a rather beautiful O(φn) where φ is the golden ratio). If we want to make this run in linear time without using our brains, we just cache (er, memoize) the intermediate results.

import qualified Data.MemoCombinators as Memo
fib :: Integer -> Integer
fib = Memo.integral fib'
where
fib' n = if n <= 1 then n else fib (n-1) + fib (n-2)


Note how the recursive call in fib' is to fib, not fib'.

There is also a library (heh, by Conal—who, contrary to what one might believe by analysis of interests, I am not :-), called MemoTrie, which does the same thing. However I don’t agree with the typeclass approach to this problem. We are concerned not simply whether or not you can memoize a particular type, but how. For example, for some problem you may know that you only want to memoize integers less than 1000, or that you want to use a fast array for those and a binary trie for the rest, or that you accept Eithers, and you want to memoize Lefts but not Rights. This is why, as the name suggests, data-memocombinators provides combinators for building nicely controlled memo tables.

I would like this library to grow, so if anybody has any suggestions or implementations for memo tables they use commonly, do send them along!

Laziness and the monad laws

Recently on haskell-cafe, the rather trivial module OneTuple was announced. I looked over the source and saw something that bugged me.

data OneTuple a = OneTuple { only :: a }

instance Functor OneTuple where
fmap f (OneTuple x) = OneTuple (f x)


I never write functor instances this way. This is too strict; I would have written it like this instead:

instance Functor OneTuple where
fmap f = OneTuple . f . only


Because it’s lazier. As I was composing a message explaining to the author why his instance was incorrect and why mine was correct, I realized something: his instance is correct and mine is incorrect! To see why, observe the functor laws:

  fmap id = id
fmap (f.g) = fmap f . fmap g


My implementation satisfies the second law, but violates the first: fmap id ⊥ = OneTuple ⊥!

The same thing happens with the monad laws; I would have written the monad instance this way:

  instance Monad OneTuple where
return = OneTuple
m >>= f = f (only m)


But this violates the right unit law:

  m >>= return = m


Since ⊥ >>= return = return (only ⊥) = OneTuple ⊥ /= ⊥

So, in summary: most Haskell hackers know not to make their functions too strict. But we have to be careful not to make our functions too lazy either, and not just from an efficiency standpoint!

General Update

My brand new Acer laptop’s video card died a few days ago. The computer still worked, I just couldn’t see anything. So, while it’s being fixed, I’m back to using my old piece of junk which refuses AC power while turned on. It has an old battery, so I’m limited to about an hour of usage before it forces me to take a break. Needless to say, this is not enough time to get any significant coding done.

Coq is totally awesome! I proved Rice’s theorem for the SK calculus, which was a bit more satisfying than proving that every natural is either even or odd. My proof is pretty ugly, I can see a lot of ways I could clean it up, but I forgive myself; I’m a noob.

Coq is making me very excited for dependent types. For example, toward the end of the aforementioned proof, I started to define some combinators so I could construct the “impossible program”:

 Definition compose := S (K S) K.
Lemma compose_really_does : forall f g x, compose f g x = f (g x).
(* ... *)
Qed.


I can envisage doing this kind of thing all the time in my regular programming practice. Write a little function, then prove that it does what I want it to. I’ve already begun exploring ways to use Coq for my every-day programming tasks, such as clean ways to hook into Haskell libraries with Extraction. Who knows how fruitful this endeavor will end up.

I have been making some interesting findings on the FRP front. Here’s an executive summary of each; I may or may not go into more detail in future posts.

(1) The semantics of behaviors are “functions of time”, but what the hell is time? In the real world, time has more structure than numbers. So I claim that in FRP what we’re actually constructing is causal transformations of time functions. A transformation R is causal if R(f)t = R(ft)t for all f, t, where the t subscript means “restricted to times before t”. As yet this has not led to any new implementation insights.

(2) FRP would be simple, easy to work with, and expressive even without its Event type, if only there were an efficient Monad instance for Behavior. For most of the cases where we would use Event, we substitute Behavior :. Future.

(3) Behavior has an efficient Monad instance, but it is not implementable in Haskell as it stands. It requires some support for linear types. Behavior itself is not linear per se, but each separate lexical instance is always accessed in a monotone pattern. So if multiple references were indicated by the compiler with split :: Behavior a -> (Behavior a, Behavior a), then you could destructively update each reference to the latest time it was accessed and avoid the expensive “fast-forwarding” necessary for the naive Monad instance. There are some other tricks, but that is the main idea.

(4) (What I’m currently exploring) It may be possible to get this efficiency without linear types by embedding the linearity in a monad, getting the “Behavior Monad Monad”. Ironically, the best candidate for doing this so far is the “Behavior :. Behavior Monad” (semantically). Whether or not it ends up working, I don’t think it will be very easy to work with.