Today I want to talk about this snippet:
This program ought to be well-behaved — it has no recursion (or recursion-encoding tricks), no undefined or error, no incomplete pattern matches, so we should expect our types to be theorems. And yet we can get inconsistent. What is going on here?
Exercise: Identify the culprit before continuing.
The problem lies in the interaction between GADTs and generalized newtype deriving. Generalized newtype deriving seems to be broken here — we created a type B which claims to be just like A including instances, but one of A‘s instances relied on it being exactly equal to A. And so we get a program which claims to have non-exhaustive patterns (in unSwitchB), even though the pattern we omitted should have been impossible. And this is not the worst that generalized newtype deriving can do. When combined with type families, it is possible to write unsafeCoerce. This has been known since GHC 6.7.
In this post I intend to explore generalized newtype deriving and GADTs more deeply, from a more philosophical perspective, as opposed to just trying to plug this inconsistency. There are a few different forces at play, and by looking at them closely we will see some fundamental ideas about the meaning of types and type constructors.
Generalized newtype deriving seems reasonable to us by appealing to an intuition: if I have a type with some structure, I can clone that structure into a new type — basically making a type synonym that is a bit stricter about the boundaries of the abstraction. But the trouble is that you can clone parts of the structure without other parts; e.g. if X is an applicative and a monad, and I declare newtype Y a = Y (X a) deriving (Monad), then go on to define a different Applicative instance, I have done something wrong. Monad and applicative are related, so you can’t just change them willy nilly as though they were independent variables. But at the very least it seems reasonable that you should be able to copy all the structure, essentially defining a type synonym but giving it a more rigorous abstraction boundary. But in Haskell, this is not possible, and that is because, with extensions such as GADTs and type families, not all of a type’s structure is clonable.
I’m going to be talking a lot about abstraction. Although the kind of abstraction I mean here is simple, it is one of the fundamental things we do when designing software. To abstract a type is to take away some of its structure. We can abstract Integer to Nat by taking away the ability to make negatives — we still represent as Integer, but because the new type has strictly fewer operations (it must be fewer — after all we had to implement the operations somehow!) we know more about its elements, and finely tuning that knowledge is where good software engineering comes from.
When implementing an abstraction, we must define its operations. An operation takes some stuff in terms of that abstraction and gives back some stuff in terms of that abstraction. Its implementation must usually use some of the structure of the underlying representation — we define addition on Nat by addition on Integer. We may take it for granted that we can do this; for example, we do not have trouble defining:
sum :: [Nat] -> Nat
even though we are not given any Nats directly, but instead under some type constructor ().
One of the properties of type constructors that causes us to take this ability to abstract for granted is that if A and B are isomorphic (in a sense that will become clear in a moment), then F A and F B should also be isomorphic. Since we, the implementers of the abstraction, are in possession of an bijection between Nats and the Integers that represent them, we can use this property to implement whatever operations we need — if they could be implemented on Integer, they can be implemented on Nat.
This isomorphism property looks like a weak version of saying that F is a Functor. Indeed, F is properly a functor from a category of isomorphisms in which A and B are objects. Every type constructor F is a functor from some category; which category specifically depends on the structure of F. F's flexibility to work with abstractions in its argument is determined by that category, so the more you can do to that category, the more you can do with F. Positive and negative data types have all of Hask as their source category, so any abstractions you make will continue to work nicely under them. Invariant functors like Endo require bijections, but fortunately when we use newtype to create abstractions, we have a bijection. This is where generalized newtype deriving gets its motivation -- we can just use that bijection to substitute the abstraction for its representation anywhere we like.
But GADTs (and type families) are different. A functor like Switch b has an even smaller category as its domain: a discrete category. The only thing which is isomorphic to A in this category is A itself -- whether there is a bijection is irrelevant. This violates generalized newtype deriving's assumption that you can always use bijections to get from an abstraction to its representation and back. GADTs that rely on exact equality of types are completely inflexible in their argument, they do not permit abstractions. This, I claim, is bad -- you want to permit the user of your functor to make abstractions.
(Aside: If you have a nice boundary around the constructors of the GADT so they cannot be observed directly, one way to do this when using GADTs is to simply insert a constructor that endows it with the necessary operation. E.g. if you want it to be a functor from Hask, just insert
Fmap :: (a -> b) -> F a -> F b
If you want it to be a functor from Mon (category of monoids), insert:
Fmap :: (Monoid n) => MonoidHom m n -> F m -> F n
(presumably F m already came with a `Monoid` dictionary). These, I believe, are free constructions -- giving your type the structure you want in the stupidest possible way, essentially saying "yeah it can do that" and leaving it to the consumers of the type to figure out how.)
In any case, we are seeing something about GADTs specifically that simple data types do not have -- they can give a lot of different kinds of structure to their domain, and in particular they can distinguish specific types as fundamentally different from anything else, no matter how similarly they may behave. There is another way to see this: defining a GADT which mentions a particular type gives the mentioned type unclonable structure, such that generalized newtype deriving and other abstraction techniques which clone some of a type's structure no longer succeed.