I’m kind of tired, so don’t expect this post to be entirely coherent. I just wanted to write a little bit about the project I started recently, called Udon.
The idea came to me as the common ground between several problems I have seen recently. I will not go into the intricate details of why I consider them problems worth addressing. They are:
- The namespace problem on Hackage, CPAN, and related module collections. Specifically, ridding ourselves of the rigid, arbitrary taxonomy of packages without the root namespace getting uselessly polluted.
- The fact that no popular DVCS allows moving files between projects while preserving history in a sane way.
- The desire for a purely functional operating system.
Incidentally, these are programming culture problems #2,#3, and #4 on my list (#1 is, of course, FRP). I’m glad I took a break from FRP so I could see this common ground.
Udon stands for “Universal Distributed Object Notation”, with a tip o’ the hat to JSON. And yes, it is a system for serializing objects. It is a gigantic distributed store for inter-referential stateless objects, and would support a huge distributed data structure with no single computer containing all of it.
I’ll get to the implementation details in a minute. I’ll first address how it helps to solve the problems above.
It will help in the implementation of a DVCS that does support moving files between projects, by more or less eliminating the idea of a project. The versioning is all in commit objects, which refer to changes in some tree data structure (representing the directory tree of the project). This is how git works. The difference is that there is no restriction about what tree this is; it could be a tree “from another project”. Just picture how you would write git as pure Haskell, never having to touch the filesystem or anything, everything being objects in memory.
This DVCS, or something similar, helps to solve the namespace problem. A project could refer to the modules it uses by identity rather than name. So instead of the cabal file saying I depend on “binary”, the cabal file (well, data structure in the new model) would say I depend on “that thing over there”, and then import the modules from that package using an advanced import mechanism, which allows renaming or qualifying modules, etc. to avoid collisions.
After conceiving the idea, I realized that it is essentially the filesystem for a typed, purely functional operating system, which is another one of my fancies.
Okay, that’s probably more than enough vagueness for you math types.
The implementation is based on a flawed concept; the same flawed concept that all the DVCSes use, that the hash of an object is a unique identifier for that object. Basically we pretend that hash collisions don’t exist. Yuck. But I could not think of any other way to do it. (This may be the thing that prevents me from going very far with this project, since I am getting in the habit of proving my code correct, and I will never be able to do that with this code since it is not correct).
There is a special type called
ExtRef a is semantically just a value of type a. The kicker is that it might be on someone else’s machine, on your usb drive, or even lost forever because nobody kept its data around. It is represented as
data ExtRef a = ExtRef Hash (Maybe a).
Hash is the “unique identifier” for the object, and you always have that. You might have an actual copy of the object too, but if not, the Hash is there so you can find it if you need it. This representation, and the whole Hashing idea, is hidden from the high-level interface, of course.
You work with
ExtRefs through the External monad. It’s a coroutine monad (free monad over i × (o -> •)) which reports which ExtRefs need their associated objects to continue the computation. The low-level interface will allow implementation of a variety of ways to handle this. The simplest one is a local store of objects to be queried. As things get more mature, that could be expanded into one that looks at various remote network stores (based on some origin map or something), or asks the user where to look, etc.
And my hope is that the little Udon framework I’ve described here will be sufficient to model the DVCS and the hackage package system as if everything were just pure data structures in memory. Of course it won’t be quite as convenient (you need to annotate your data structures with some ExtRefs at least), but it would at least encourage pure modeling of these “real world” systems.