Haskell’s Big Three

As my regular readers know, I am a zealous Haskell advocate. After many years of programming in many different languages, Haskell has secured itself as my #1 language for almost every problem: using any other is like painting with a mop, or like playing Tetris without being able to rotate your pieces. Nonetheless, Haskell has some unsightly areas, especially when considering programming in the large. The more I use Haskell to tackle big problems, the more obnoxious these areas become. This post is my account of Haskell’s Big Three annoyances. Contrary to my usual shtick about cutting out features because fewer features means more properties, these are missing features. The order in which I list them is meaningless.

1. No Module Abstraction

Some of the code I am writing for Evo (purely functional RTS game), if stated in full generality, would look like this:

gameUI :: (RealFrac time, Image image, Causal causal) => UI time image causal () ()

The first three parameters are constant throughout my entire program. I can’t in my right mind make a data type with 5 type parameters, and I certainly refuse to write those same three constraints on every function involving UI. So, sacrificing generality, I have fixed those three parameters to the particular choices I need for Evo. But suppose this module were to be reused: by fixing my choice of, e.g., image to Graphics.DrawingCombinators.Image, I have disallowed the reuse of this module by any program that uses another library for drawing. But I stand by my aesthetic choice, so that I don’t obscure my code’s inner beauty behind repulsive, indecipherable type signatures.

Haskell needs some form of abstraction over modules. ML functors, Agda modules (aka. awesome records), Coq sections — any way to abstract over more than one definition. This would allow module authors to make their code clean and reusable at the same time.
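To make the request concrete, here is a minimal sketch of the usual workaround in Haskell as it stands: bundle the parameters into a record of operations and have one function return every definition that depends on them. All names here (Drawing, Widgets, mkWidgets, asciiDrawing) are hypothetical stand-ins and are not taken from Evo or any library.

```haskell
module Main where

-- The "module parameters", bundled as one record of operations.
-- Hypothetical stand-in for a drawing library's interface.
data Drawing image = Drawing
  { blank   :: image
  , overlay :: image -> image -> image
  , render  :: image -> String
  }

-- The "module body": definitions that depend on the parameters,
-- returned together so the parameters are mentioned only once.
data Widgets image = Widgets
  { frame :: image -> image
  , stack :: [image] -> image
  }

-- Plays the role of an ML functor: parameters in, definitions out.
mkWidgets :: Drawing image -> Widgets image
mkWidgets d = Widgets
  { frame = \i -> overlay d i (blank d)
  , stack = foldr (overlay d) (blank d)
  }

-- One possible instantiation, where an image is just a String.
asciiDrawing :: Drawing String
asciiDrawing = Drawing { blank = "", overlay = (++), render = id }

main :: IO ()
main = do
  let w = mkWidgets asciiDrawing
  putStrLn (render asciiDrawing (stack w ["a", "b", "c"]))
```

This keeps individual signatures short, but it is a manual encoding: every use site has to thread the record through by hand, and type-level parameters still show up on Drawing and Widgets. That is the gap a real module abstraction would close.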

2. Module Naming Policy

Hackage is a mess. I cringe looking at vector-space, taking up 9 valuable top-level names (“Data.” doesn’t count, everybody uses that), preventing anyone else from naming a module or module suite Data.Cross or Data.Derivative (the latter I actually have a candidate for). Then there is data-object for JSON objects, as if nothing else could conceivably be called an object. transformers and mtl both have a module named Control.Monad.Trans, preventing both from being used from the same package (suppose this package depends on two other packages, each of which depends on a different monad library).

The quick fix, what every other language has done, is to institute a policy or culture of naming conventions that makes the probability of collision low. I feel like Hackage is nearing its limit — the ad-hoc Data.Blah naming conventions won’t last through another order of magnitude. If we encouraged people to name packages more conservatively, we may last through one or two more.

But a naming convention is just delaying the inevitable, giving us a false sense of security. What happens when a package is forked, two packages come to depend on two different branches of this package which forgot to rename itself, and a third package wants to depend on both of those? We need something innovative, and in Haskell’s spirit! Let’s allow the flexibility in package names that we allow in symbol names from imports now — if there is a name collision, just rename one of them for the scope of your package. No big deal. Let module authors name their modules whatever they think is clearest — go ahead, name your module Monad, we don’t care. Right now, a name like that would be a death sentence for that module due to impoliteness.

3. Unrefactorable Typeclasses

The Haskell prelude is a very nice place, in general (what, you haven’t read it? Go, it’s a nice read!). However there is one cesspool of which every experienced Haskeller is ashamed: the Num class. Let’s look:

class  (Eq a, Show a) => Num a  where
    (+), (-), (*)    :: a -> a -> a
    negate           :: a -> a
    abs, signum      :: a -> a
    fromInteger      :: Integer -> a

(+) and (-) almost surely belong, as does fromInteger (supposing you don’t allow more flexibility with numeric literals a la IsString). (*) might be considered reasonable, though we have just disallowed vectors as instances of Num. abs and signum… well I guess as long as you’re putting in (*) those make sense. But the superclasses Eq and Show, especially Show, what? What about being a number involves being convertible to a string? Eq and Show both disallow computable reals from being a bona fide instance of Num, they have to cheat. Same goes for notational extensions like f Int where f is any applicative functor.
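The vector complaint can be illustrated with a small sketch. V2 is a hypothetical type made up for this example. The class quoted above is the Haskell 98 definition; as far as I know, GHC dropped the Eq and Show superclasses in version 7.4, but the remaining methods are unchanged, so the instance below still has to invent meanings for (*), abs, signum and fromInteger.

```haskell
module Main where

-- A hypothetical 2D vector type, used only for illustration.
data V2 = V2 Double Double
  deriving (Eq, Show)

instance Num V2 where
  -- Addition, subtraction and negation are the natural vector operations.
  V2 a b + V2 c d = V2 (a + c) (b + d)
  V2 a b - V2 c d = V2 (a - c) (b - d)
  negate (V2 a b) = V2 (negate a) (negate b)
  -- Everything below is an arbitrary choice forced on us by the class:
  -- vectors have no canonical product, absolute value or sign.
  V2 a b * V2 c d = V2 (a * c) (b * d)
  abs (V2 a b)    = V2 (abs a) (abs b)
  signum (V2 a b) = V2 (signum a) (signum b)
  -- Numeric literals become "diagonal" vectors, which is rarely intended.
  fromInteger n   = V2 (fromInteger n) (fromInteger n)

main :: IO ()
main = print (V2 1 2 + V2 3 4 - 1)
```

The instance type-checks, but the literal 1 in main silently becomes V2 1 1, which is exactly the kind of accident a finer-grained hierarchy would rule out.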

The Num class is pretty bad, but I excuse the Haskell designers for making that mistake. Library design is hard, especially when you have never used the language you are designing the library for. The weakness in Haskell is much deeper: we are stuck with this cruft because any attempt to change it would break all sorts of code everywhere. Num isn’t the only class affected, it is just the most public example. Any module which defines a typeclass and allows users to instantiate it suffers from this problem. As soon as the typeclass is defined, it may no longer change. You are stuck with that decision forever, or else you break backwards compatibility.

Ultimately what we would like to do is split Num up into algebraic structures like Group (with (+), (-)), Ring (with (*)), etc. There is a simple, conservative proposal that could solve this problem, which essentially would allow us to say:

class alias Num a = (Eq a, Show a, Field a, IsInteger a)

Which would allow all four of those typeclasses to be instantiated in one go, thus treating the collection of four as one, and allowing us to introduce finer granularity in the Num typeclass without breaking everything. GHC has implemented all kinds of crazy extensions, and this seems tame in comparison. I wonder what is preventing it. (Maybe I should get into some GHC hacking?)
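For comparison, here is a sketch of how far a constraint synonym gets you, using GHC’s ConstraintKinds extension (which postdates this post). The classes Additive and Multiplicative and the synonym Numeric are hypothetical names chosen for illustration; they are not the Group, Ring or Field of the proposal.

```haskell
{-# LANGUAGE ConstraintKinds #-}
module Main where

-- Hypothetical finer-grained classes; the names are illustrative only.
class Additive a where
  zero  :: a
  (.+.) :: a -> a -> a

class Additive a => Multiplicative a where
  one   :: a
  (.*.) :: a -> a -> a

-- A constraint synonym: usable in signatures, but not as an instance head.
type Numeric a = (Eq a, Show a, Multiplicative a)

-- Each class still needs its own instance declaration.
instance Additive Int where
  zero  = 0
  (.+.) = (+)

instance Multiplicative Int where
  one   = 1
  (.*.) = (*)

-- One constraint in the signature stands for all of the classes above.
sumSquares :: Numeric a => [a] -> a
sumSquares = foldr (\x acc -> (x .*. x) .+. acc) zero

main :: IO ()
main = print (sumSquares [1, 2, 3 :: Int])
```

This covers the signature half of the problem only. Existing code that writes a single instance for the old class would still break after a split, because a synonym cannot be instantiated in one go, and that is the part the class alias proposal is meant to address.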

I have identified further typeclass modularity problems that could be solved by a more radical suggestion: outlaw orphan instances. But that fight is for another day.

Conclusion

I consider Haskell to be by a wide margin the prettiest practical language in existence. Haskell code in the small is often perfect to my eyes, “from the book” as Erdős would say. Each of these suggestions is about improving Haskell code in the large. They allow the incremental creation of a perfect module, a perfect package, a perfect typeclass. We are closer than we think to a language in which an entire program could be considered perfect.

I wrote this article to emphasize these features, open them to active discussion and critique, and entice their implementation. What do you think, are they worth the effort?


14 thoughts on “Haskell’s Big Three”

  1. For the record, data-object is not limited to JSON; I was explaining the choice of the term object being the same as the definition used in JSON data. data-object has a datatype for representing three types of data:

    Scalar
    Sequence
    Mapping

    I think you can find those sufficiently general to warrant the name.

  2. IMHO the class alias proposal is not really simple. The key problem is how to partition definitions in a class instance declaration between the various classes the type class is an alias for.

    There is lots of potential for ambiguity – remember that you might mention a single type class several times in the alias. Food for thought: is the following program fragment acceptable?

    class Foo a b where
        aonly :: a -> a
        andab :: a -> b -> a

    class Bar b = (Foo Int b, Foo String b)

    instance Bar Int where
        aonly :: Int -> Int
        aonly = id

        aonly :: String -> String
        aonly = ('a':)

    Simple idea: only allow a single occurrence of any one class name in the fully expanded constraint. This might get complex in the presence of extensions like constraint families (http://blog.omega-prime.co.uk/?p=61) however, because you don’t know at the definition site which classes you are talking about.

    Class aliases just seem a bit ad-hoc to me.

  3. Thanks for taking the time to highlight these problems. The first two on your list are things that I struggle with a lot too. As for classes, it’s not something I’ve had lots of problems with, but I nevertheless agree that something should be done about it.

  4. The first and third items here I completely agree with. For the second, I’m not sure whether you’re complaining about a lack of namespacing of module names or package names. The module names “problem” is, I think, a non-issue, due to GHC’s -XPackageImports:

    import qualified "mtl" Control.Monad.Trans as MTL
    import qualified "transformers" Control.Monad.Trans as Transformers

    For the record, I *really* like that we can use nice, simple, clear module names like these (and that we can omit the package name if the compiler can work it out).

    For packages, I agree we need a convention. Might I suggest “yourpackage.yourdomain.org”? Please let’s avoid the horrible Java-style backwards domain names…

  5. Just to mention my peeve about Prelude peeves: if the Num class is going to be refactored, it’s (-), negate, abs, and signum which don’t belong. In my work I’m constantly running into semirings and rarely dealing with groups. I think (+) and (*) belong together far moreso than (+) and (-). I mean, ideally we’d have a nice way for semigroups, monoids, groups, semirings, rings, fields,… to all live together in harmony. But it always irks me when folks ignore basics like semirings and go bounding off after vector spaces and other things of that ilk.

  6. I’m not a Haskell expert, but couldn’t you solve the first problem by using a partially applied type? “Real World Haskell” gives this example:

    type SimpleState s a = s -> (a, s)
    type StringState a = SimpleState String a

  7. Type synonyms go quite a way for problem number one, I’d think? Also, I don’t necessarily know if it’s that bad for something with a general type to look like a general type. Not that a better module system wouldn’t be better.

    And, as was pointed out, ghc has a feature to ameliorate problem two. Although a more flexible namespace convention would still be a good thing.

    Something like class aliases (and a refactoring of Num) is certainly needed.

  8. Great post.
    1. I resolve issues like this using a custom literate preprocessor. Five minutes to tweak it to get around an annoyance. Here documents. Bird tracks that look like bird droppings, replaced by “indent code, flush comments”.
    2. I learned a great work-around from the comments. Thanks.
    3. I don’t think any fixed solution will make me happy. In my dream language I get to remap + and * without anyone else’s preconceived notion of the “right” answer interfering. Of course I can in Haskell, but only by going way off the ranch.

  9. You can get away from your first problem at least:

    gameUI :: (RealFrac time, Image image, Causal causal) => UI time image causal ()

    could be refactored using class associated types

    class (RealFrac (Time a), Image (ImageFormat a), Causal (CausalDomain a)) => Config a where
        type Time a :: *
        type ImageFormat a :: *
        type CausalDomain a :: *

    then you can build your UI with

    data UI a b = UI … (Time a -> ImageFormat a) b

    and your signatures simplify to one parameter:

    gameUI :: Config a => UI a ()

    This of course doesn’t preclude the requirement that you be sufficiently prescient to know you’ll need to parameterize all of your code on such an argument.

    As for the other two, they are definitely glaring problems.

  10. I like this post. Actually I was afraid that there is a need for complete remaking of Haskell. :)
    “preventing anyone else from naming a module or module suite named Data.Cross or Data.Derivative”
    So, a module name ( @:: [String]@ ) should begin with a ‘String’ which points to the author of the module. Java uses a DNS name of author’s website, but IMHO author’s public PGP key would be better. (Because your DNS registrar can take back your DNS name. And DNS here is superfluous bureaucracy.)
