Udon Sketch #2

The core library of Udon is basically finished (see my previous post for a sketch of what udon is). It needs cleanup, documentation, and a quickcheck suite, but the functionality is there and working in my small tests. I have begun udon-shell, which is a way you can interact with udon objects from the command line. Mostly it’s just a way to test things, but may become an integral helper tool for the more complex tools to come.

My favorite thing about the Udon core is that there is nary a mention of IO anywhere. It is a purely functional core, the way I was told real software is written when I was a Haskell child. :-)

So, where to go from here? I have two projects in mind, which may just be one project. One of them is to make a version control system which blurs the idea of a “project”, so I don’t care whether it’s one or two. The beautiful thing about this library is that I can model these projects in a purely functional way (as long as I don’t reference any functions in my persistent data structures), and the persistence and distributedness comes for free.

Here is how I picture the “cabal file” for the package manager.

type PackageName = [String]  -- list of components
type ImportList = [(PackageName, PackageName)] -- use module Foo, name it Bar
data Package = Package {
    name :: String,  -- whatever you want, no thought involved, rename at will
    depends :: [(ExtRef Package, ImportList)],
    ...
  }

Which is the essence. That packages are referred two by reference rather than by name, and modules from a package are explicitly imported and can be renamed to avoid conflicts. I’m not worrying about how this interacts with the state of the ghc art; it’ll work out.

I’ll come back to this in a bit. First I want to talk about smart versioning.

Smart versioning has little to do with Udon, but as long as I’m making a version control system I might as well play with it (though I doubt I have the attention span to carry it to its full potential). There are more types of content than just text and directories. For example, one might write a “Haskell source” plugin which understands Haskell source code, rather than just seeing it as a collection of lines. Then you could tell the plugin about the refactors you do, such as renaming a function. If that change is merged with a version from before that change, any true references to that function will get renamed (but shadows of that name would remain).

Taking this further, say there were a “Haskell package” content type. Then if you broke backward compatibility between two versions, you would annotate how. If it’s something simple like a renaming, the package manager could automatically upgrade your project to work with the new version. If it’s something complex like the semantics of a function, it could see if you ever used that function and mark the locations in your code that needed to be updated.

Such patch abstractions, which are of course custom depending on the content type, could be the basis for the version contol system. I think Darcs does this kind of thing too, I’m not sure. But I use git’s “object identity” theory rather than Darcs’s patch theory.

So, given smart versioning patches, what is a proper package spec?

I want to support:

  • A package allowed to depend on multiple versions of another, with forward compatibility as I just described.
  • Splitting a package into two.
  • Merging two packages into one.
  • Renaming stuff (of course).

A package can vary by any combination of its dependencies. My hunch is that there should be a different object for each combination of dependencies (together with versions). But we don’t store all of them. Instead we store a root package, presumably depending on whatever precise versions of dependencies it was developed with, and then derive the package for other sets of dependencies using package diffs. To ensure robustness, a test suite should be included in the package to make sure the derivations were correct (though diffs ought to be pretty sure of themselves to begin with).

As scary as it might seem, this is already better than cabal. With cabal, the package makes some wild declaration about which versions of its dependencies it can support. And it can lie. There is no automated test that checks whether it told the truth, there is no way to inform a package with dependencies too liberal that the semantics of a function changed while its type stayed the same (thus introducing subtle bugs). Basically for cabal to work, you should have checked a package against every dependency you declare. But nobody does that.

Also, what of the case where a package needs modification between two versions of one of its dependencies. You have to resort to gross conditional compilation stuff.

Hmm, a troublesome case comes to mind. Suppose the case I just described happens, where foo depends on bar. foo-A works with bar-A, and foo-B works with bar-B. Now I find a bug in foo. I don’t want to have to fix it twice. So let’s say I fix it in foo-A. What happens to foo-B? How does it get the fix.

Well, this is a purely functional package manager, so I can’t really fix it in foo-A. So let’s say I fix foo-A and it becomes foo-A’. Then what I need to do to foo-B to fix it is apply the patch foo-A’/foo-A, to get foo-B’. Hmm, I guess that’s it. It will be tricky for a package author to keep this dependency information coherent for other packages to use.

Okay, I need to get to bed. Final sketch: packages do not refer to other versions of themselves. They stand alone, referring to their dependencies by exact version. Then edges (noodles?) are published, together with a possible “upgrade direction” (you always want the target of this edge if you can). Then some clever algorithm tries to upgrade the package and packages that depend on it. If one of the dependents fails, you can try it with an appropriate edge for the dependent (if one exists).

Hmm, a nice thing this solves is the ability to resolve (to some extent) that horrid dependency on multiple versions of the same package, by renaming one of the versions to something else. Then it is apparent and resolvable, at least, when you are treating two types from different versions as equal when they are not.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s