We are all familiar with Cantor’s diagonal argument that proves there exist infinite sets which are “larger” than the set of natural numbers. In this post I will show that we can express this argument in the form of a program, thus showing that there are countable sets which are “computably uncountable”.

I begin with the program itself:

type Cantor = Nat -> Bool diagonal :: (Nat -> Cantor) -> Cantor diagonal cs n = not (cs n n)

`Cantor` is “the cantor space”, the type of infinite sequences of booleans. We will call such an infinite sequence “a `Cantor`“. There are clearly infinitely many `Cantor`s; e.g. take the range of this function which gives `False` at every position except the one specified:

unit :: Nat -> Cantor unit m n = m == n

`diagonal` is (Georg) Cantor’s diagonal argument written as a program — it takes an alleged sequence of all `Cantor`s, and returns a `Cantor` which does not occur in the sequence, by construction. This function shows by contradiction that we cannot put `Cantor`s in 1 to 1 correspondence with naturals, and thus that there are *more* `Cantor`s than there are naturals.

So how many `Cantor`s are there? Since `Nat -> Bool` is a Haskell type — the type of *computable* functions from `Nat` to `Bool` — `Cantor`s must be representable by programs. We can encode programs as numbers by treating their source code as base-128 numbers. Hence, there are no more `Cantor`s than naturals, and so `Cantor`s can be put into 1 to 1 correspondence with naturals.

Wait — what? There are more `Cantor`s than `Nat`s, but they both have the same size? Something is wrong. Indeed, in the process of this argument we have asserted both

- “We cannot put
`Cantor`s in 1 to 1 correspondence with naturals” - “
`Cantor`s can be put into 1 to 1 correspondence with naturals”

We clearly can’t have both.

### I

The erroneous statement is (2). It is undecidable whether a given program represents a `Cantor`. If the `n`th `Cantor` is ⊥ at `n`, then `diagonal` will fail: `diagonal cs n = not (cs n n) = not ⊥ = ⊥`. Because ⊥ is a fixed point of `not`, `diagonal` cannot return an element different from the one it was given. Thus for `diagonal` to work, we must require that `Cantor`s be *fully-defined* — no infinite loops!

With this requirement, we can no longer put `Cantor`s in 1 to 1 correspondence with the naturals, because we would have to solve the halting problem. It is not enough that the type of the term is a `Cantor`, it now must be fully defined for all inputs, and determining that given arbitrary source code is an undecidable problem.

### II

The erroneous statement is (1). `Cantor`s are computable functions, so as we have argued, they have the same cardinality as the naturals. There are no more programs than numbers, so *by the definition* of equal cardinality we can put them in 1 to 1 correspondence with a function.

The problem with (1) occurs because `diagonal` takes as its first argument not an arbitrary sequence of `Cantor`s, but a *computable* sequence of `Cantor`s. If `cs` is not computable, then neither is `diagonal cs` (for we no longer have `cs`‘s source code with which to construct it), and `Cantor`s are defined to be computable sequences. So `diagonal` fails to contradict our bijection.

### III

The erroneous statement is (2). Section II claims to put `Cantor`s and naturals in 1 to 1 correspondence, but it is lying. Suppose Section II is formulated with respect to some axiom system *A*. If it were “telling the truth”, we would expect there to be some term *f* in the language of *A* such that for every fully defined `Cantor` program *c*, there is some natural number *n* such that we have (i.e. it is a theorem of *A* that *f(1 + 1 + … + 1) = (source code of c)*).

Let’s suppose we have written down the axioms of *A* into a Haskell program, and we have a (partial) function `proofSearch :: Nat -> Cantor`, which, given a number n, searches for theorems of the form and compiles and returns the first such *c* it finds. In the case that there is no such statement, it just runs forever; similarly for the case that *c* fails to compile. Although cumbersome to write, I’m sure we agree that this is *possible* to write. If section II is not lying, then we expect that for every natural *n*, `proofSearch n` *does* in fact return a valid `Cantor`.

Now, let us return to familiar lands with a new program:

evidence :: Cantor evidence = diagonal proofSearch

Oh my! If section II is the truth, then `proofSearch` is a *total, computable* function of type `Nat -> Cantor`, which we can pass to `diagonal` to find a `Cantor` that it missed! So it must have been lying, either (1) about its function `f` finding every possible `Cantor` or (2) about it actually possessing such a function (i.e. it “proved” that there is such a function, but it couldn’t actually represent it). In either case, it did not actually create a 1 to 1 correspondence between the naturals and `Cantor`s.

### IV

*Left as an exercise for the reader.*

Which one is it *really*?

Related: seemingly impossible functions

Your two statements are using the word ‘correspondence’ to mean different things. In (1) you mean a computable bijection, whereas in (2) you mean an arbitrary, possibly noncomputable, bijection. You can can put Cantors in 1-1 correspondence with the naturals, but any such correspondence is not computable.

This came at a very appropriate time for me, as I was discussing similar issues on a mailing list:

http://groups.google.com/group/magic-list/browse_thread/thread/a314e33d0bbcc1b5

This is relevant, too: Representations of uncomputable and uncountable sets.

I’m glad you’re giving this some attention again. This came up two years ago on Haskell CafĂ© (a discussion in which you participated as well) and it’s been in the back of my mind ever since.

http://www.mail-archive.com/haskell-cafe@haskell.org/msg52999.html