Monthly Archives: March 2010

Haskell’s Big Three

As my regular readers know, I am a zealous Haskell advocate. After many years of programming in many different languages, Haskell has secured itself as my #1 language for almost every problem: using any other is like painting with a mop, or like playing Tetris without being able to rotate your pieces. Nonetheless, Haskell has some unsightly areas, especially when considering programming in the large. The more I use Haskell to tackle big problems, the more obnoxious these areas become. This post is my account of Haskell’s Big Three annoyances. Contrary to my usual shtick about cutting out features because fewer features means more properties, these are missing features. The order in which I list them is meaningless.

1. No Module Abstraction

Some of the code I am writing for Evo (a purely functional RTS game), if stated in full generality, would look like this:

gameUI :: (RealFrac time, Image image, Causal causal) => UI time image causal () ()

The first three parameters are constant throughout my entire program. I can’t in my right mind make a data type with five type parameters, and I certainly refuse to write those same three constraints on every function involving UI. So, sacrificing generality, I have fixed those three parameters to the particular choices I need for Evo. But suppose this module were to be reused: by fixing my choice of, e.g., image to Graphics.DrawingCombinators.Image, I have disallowed the reuse of this module by any program that uses another library for drawing. But I stand by my aesthetic choice, so that I don’t obscure my code’s inner beauty behind repulsive, indecipherable type signatures.

Haskell needs some form of abstraction over modules. ML functors, Agda modules (a.k.a. awesome records), Coq sections: any way to abstract over more than one definition. This would allow module authors to make their code clean and reusable at the same time.
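To make "abstracting over more than one definition" concrete, here is a minimal sketch of how it can be approximated in today's Haskell: a record of operations plays the role of the functor argument, and a function from that record to a record of definitions plays the role of the functor. This is a workaround, not the module abstraction asked for above, and every name in it (Drawing, Widgets, mkWidgets, stringDrawing) is hypothetical and unrelated to Evo or Graphics.DrawingCombinators.

```haskell
-- Hypothetical "signature": the operations a drawing backend must supply.
data Drawing image = Drawing
  { blank   :: image
  , overlay :: image -> image -> image
  }

-- Hypothetical "module body": several definitions sharing one parameter.
data Widgets image = Widgets
  { stack :: [image] -> image
  , twice :: image -> image
  }

-- The "functor": fix the backend once, get every definition specialised to it.
mkWidgets :: Drawing image -> Widgets image
mkWidgets d = Widgets
  { stack = foldr (overlay d) (blank d)
  , twice = \i -> overlay d i i
  }

-- A toy backend that uses strings as "images", purely for illustration.
stringDrawing :: Drawing String
stringDrawing = Drawing { blank = "", overlay = (++) }
```

The cost of this encoding is that every use site has to project out of the record (stack (mkWidgets stringDrawing) ...), which is exactly the kind of noise a real module system would hide.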

2. Module Naming Policy

Hackage is a mess. I cringe looking at vector-space, taking up 9 valuable top-level names (“Data.” doesn’t count, everybody uses that), preventing anyone else from naming a module or module suite Data.Cross or Data.Derivative (the latter I actually have a candidate for). Then there is data-object for JSON objects, as if nothing else could conceivably be called an object. transformers and mtl both have a module named Control.Monad.Trans, preventing both libraries from being used from the same package (suppose this package depends on two other packages, each of which depends on a different monad library).

The quick fix, what every other language has done, is to institute a policy or culture of naming conventions that makes the probability of collision low. I feel like Hackage is nearing its limit: the ad-hoc Data.Blah naming conventions won’t last through another order of magnitude. If we encouraged people to name packages more conservatively, we might last through one or two more.

But a naming convention is just delaying the inevitable, giving us a false sense of security. What happens when a package is forked, two packages come to depend on two different branches of this package which forgot to rename itself, and a third package wants to depend on both of those? We need something innovative, and in Haskell’s spirit! Let’s allow the flexibility in package names that we allow in symbol names from imports now — if there is a name collision, just rename one of them for the scope of your package. No big deal. Let module authors name their modules whatever they think is clearest — go ahead, name your module Monad, we don’t care. Right now, a name like that would be a death sentence for that module due to impoliteness.
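For comparison, this is the symbol-level flexibility that imports already give us. The sketch below uses only modules from base: Data.List and Data.List.NonEmpty both export sort and head, and the collision is resolved locally by giving each module a short alias. The function names firstSorted and sortedList are made up for the example. The proposal above amounts to wanting the same local renaming one level up, for module names themselves.

```haskell
-- Both modules export `sort` and `head`; qualified aliases resolve the clash
-- for the scope of this file only, without either library having to rename.
import qualified Data.List          as L
import qualified Data.List.NonEmpty as NE
import           Data.List.NonEmpty (NonEmpty (..))

-- Uses the NonEmpty versions of sort and head.
firstSorted :: NonEmpty Int -> Int
firstSorted = NE.head . NE.sort

-- Uses the ordinary list version of sort.
sortedList :: [Int] -> [Int]
sortedList = L.sort
```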

3. Unrefactorable Typeclasses

The Haskell Prelude is a very nice place, in general (what, you haven’t read it? Go, it’s a nice read!). However, there is one cesspool of which every experienced Haskeller is ashamed: the Num class. Let’s look:

class  (Eq a, Show a) => Num a  where
    (+), (-), (*)    :: a -> a -> a
    negate           :: a -> a
    abs, signum      :: a -> a
    fromInteger      :: Integer -> a

(+) and (-) almost surely belong, as does fromInteger (supposing you don’t allow more flexibility with numeric literals a la IsString). (*) might be considered reasonable, though we have just disallowed vectors as instances of Num. abs and signum… well, I guess as long as you’re putting in (*) those make sense. But the superclasses Eq and Show, especially Show, what? What about being a number involves being convertible to a string? Eq and Show both disallow computable reals from being a bona fide instance of Num; they have to cheat. The same goes for notational extensions like f Int where f is any applicative functor.
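Here is a small sketch of what "disallowed vectors" looks like in practice. The V2 type is hypothetical. Addition, subtraction and negation have obvious definitions, but to satisfy Num the instance also has to invent meanings for (*), abs, signum and fromInteger, none of which is canonical for a vector.

```haskell
-- A hypothetical 2D vector type, used only to illustrate the problem.
data V2 = V2 Double Double
  deriving (Eq, Show)

instance Num V2 where
  -- The operations that genuinely make sense for vectors.
  V2 a b + V2 c d = V2 (a + c) (b + d)
  V2 a b - V2 c d = V2 (a - c) (b - d)
  negate (V2 a b) = V2 (negate a) (negate b)

  -- Everything below is an arbitrary choice forced on us by the class.
  V2 a b * V2 c d = V2 (a * c) (b * d)      -- componentwise; not a vector product
  abs    (V2 a b) = V2 (abs a) (abs b)      -- not the norm, which is a scalar
  signum (V2 a b) = V2 (signum a) (signum b)
  fromInteger n   = V2 (fromInteger n) (fromInteger n)
```

A consequence of the last definition is that the literal 2 now typechecks as a V2, which is rarely what anyone wants.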

The Num class is pretty bad, but I excuse the Haskell designers for making that mistake. Library design is hard, especially when you have never used the language you are designing the library for. The weakness in Haskell is much deeper: we are stuck with this cruft because any attempt to change it would break all sorts of code everywhere. Num isn’t the only class affected; it is just the most public example. Any module which defines a typeclass and allows users to instantiate it suffers from this problem. As soon as the typeclass is defined, it may no longer change. You are stuck with that decision forever, or else you break backwards compatibility.

Ultimately what we would like to do is split Num up into algebraic structures like Group (with (+), (-)), Ring (with (*)), etc. There is a simple, conservative proposal that could solve this problem, which essentially would allow us to say:

class alias Num a = (Eq a, Show a, Field a, IsInteger a)

This would allow all four of those typeclasses to be instantiated in one go, thus treating the collection of four as one, and allowing us to introduce finer granularity in the Num typeclass without breaking everything. GHC has implemented all kinds of crazy extensions, and this seems tame in comparison. I wonder what is preventing it. (Maybe I should get into some GHC hacking?)
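To show which half of this is already expressible, here is a rough sketch using GHC's ConstraintKinds extension. The classes Group and Ring, their operators, and the names Number and describeSquare are all hypothetical and are not part of any proposal. A constraint synonym lets one name stand for several classes in a type signature, but it does not let one instance declaration cover them all, and that second half is what a class alias would add.

```haskell
{-# LANGUAGE ConstraintKinds #-}

-- Hypothetical finer-grained classes; names and operators are illustrative.
class Group a where
  zero  :: a
  (.+.) :: a -> a -> a
  neg   :: a -> a

class Group a => Ring a where
  one   :: a
  (.*.) :: a -> a -> a

-- A constraint synonym: usable in signatures, but not as an instance head.
type Number a = (Eq a, Show a, Ring a)

-- Without class aliases, each class still needs its own instance declaration.
instance Group Integer where
  zero  = 0
  (.+.) = (+)
  neg   = negate

instance Ring Integer where
  one   = 1
  (.*.) = (*)

-- One constraint in the signature instead of three.
describeSquare :: Number a => a -> String
describeSquare x = show (x .*. x)
```

Existing code that writes a single instance for the old class would still break under this split, which is why the synonym alone does not solve the refactoring problem.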

I have identified further typeclass modularity problems that could be solved by a more radical suggestion: outlaw orphan instances. But that fight is for another day.

Conclusion

I consider Haskell to be by a wide margin the prettiest practical language in existence. Haskell code in the small is often perfect to my eyes, “from the book” as Erdős would say. Each of these suggestions is about improving Haskell code in the large. They allow the incremental creation of a perfect module, a perfect package, a perfect typeclass. We are closer than we think to a language in which an entire program could be considered perfect.

I wrote this article to emphasize these features, open them to active discussion and critique, and entice their implementation. What do you think, are they worth the effort?