We are all familiar with Cantor’s diagonal argument that proves there exist infinite sets which are “larger” than the set of natural numbers. In this post I will show that we can express this argument in the form of a program, thus showing that there are countable sets which are “computably uncountable”.

I begin with the program itself:

type Cantor = Nat -> Bool diagonal :: (Nat -> Cantor) -> Cantor diagonal cs n = not (cs n n)

`Cantor` is “the cantor space”, the type of infinite sequences of booleans. We will call such an infinite sequence “a `Cantor`“. There are clearly infinitely many `Cantor`s; e.g. take the range of this function which gives `False` at every position except the one specified:

unit :: Nat -> Cantor unit m n = m == n

`diagonal` is (Georg) Cantor’s diagonal argument written as a program — it takes an alleged sequence of all `Cantor`s, and returns a `Cantor` which does not occur in the sequence, by construction. This function shows by contradiction that we cannot put `Cantor`s in 1 to 1 correspondence with naturals, and thus that there are *more* `Cantor`s than there are naturals.

So how many `Cantor`s are there? Since `Nat -> Bool` is a Haskell type — the type of *computable* functions from `Nat` to `Bool` — `Cantor`s must be representable by programs. We can encode programs as numbers by treating their source code as base-128 numbers. Hence, there are no more `Cantor`s than naturals, and so `Cantor`s can be put into 1 to 1 correspondence with naturals.

Wait — what? There are more `Cantor`s than `Nat`s, but they both have the same size? Something is wrong. Indeed, in the process of this argument we have asserted both

- “We cannot put
`Cantor`s in 1 to 1 correspondence with naturals” - “
`Cantor`s can be put into 1 to 1 correspondence with naturals”

We clearly can’t have both.

### I

The erroneous statement is (2). It is undecidable whether a given program represents a `Cantor`. If the `n`th `Cantor` is ⊥ at `n`, then `diagonal` will fail: `diagonal cs n = not (cs n n) = not ⊥ = ⊥`. Because ⊥ is a fixed point of `not`, `diagonal` cannot return an element different from the one it was given. Thus for `diagonal` to work, we must require that `Cantor`s be *fully-defined* — no infinite loops!

With this requirement, we can no longer put `Cantor`s in 1 to 1 correspondence with the naturals, because we would have to solve the halting problem. It is not enough that the type of the term is a `Cantor`, it now must be fully defined for all inputs, and determining that given arbitrary source code is an undecidable problem.

### II

The erroneous statement is (1). `Cantor`s are computable functions, so as we have argued, they have the same cardinality as the naturals. There are no more programs than numbers, so *by the definition* of equal cardinality we can put them in 1 to 1 correspondence with a function.

The problem with (1) occurs because `diagonal` takes as its first argument not an arbitrary sequence of `Cantor`s, but a *computable* sequence of `Cantor`s. If `cs` is not computable, then neither is `diagonal cs` (for we no longer have `cs`‘s source code with which to construct it), and `Cantor`s are defined to be computable sequences. So `diagonal` fails to contradict our bijection.

### III

The erroneous statement is (2). Section II claims to put `Cantor`s and naturals in 1 to 1 correspondence, but it is lying. Suppose Section II is formulated with respect to some axiom system *A*. If it were “telling the truth”, we would expect there to be some term *f* in the language of *A* such that for every fully defined `Cantor` program *c*, there is some natural number *n* such that we have (i.e. it is a theorem of *A* that *f(1 + 1 + … + 1) = (source code of c)*).

Let’s suppose we have written down the axioms of *A* into a Haskell program, and we have a (partial) function `proofSearch :: Nat -> Cantor`, which, given a number n, searches for theorems of the form and compiles and returns the first such *c* it finds. In the case that there is no such statement, it just runs forever; similarly for the case that *c* fails to compile. Although cumbersome to write, I’m sure we agree that this is *possible* to write. If section II is not lying, then we expect that for every natural *n*, `proofSearch n` *does* in fact return a valid `Cantor`.

Now, let us return to familiar lands with a new program:

evidence :: Cantor evidence = diagonal proofSearch

Oh my! If section II is the truth, then `proofSearch` is a *total, computable* function of type `Nat -> Cantor`, which we can pass to `diagonal` to find a `Cantor` that it missed! So it must have been lying, either (1) about its function `f` finding every possible `Cantor` or (2) about it actually possessing such a function (i.e. it “proved” that there is such a function, but it couldn’t actually represent it). In either case, it did not actually create a 1 to 1 correspondence between the naturals and `Cantor`s.

### IV

*Left as an exercise for the reader.*

Which one is it *really*?

Related: seemingly impossible functions

Your two statements are using the word ‘correspondence’ to mean different things. In (1) you mean a computable bijection, whereas in (2) you mean an arbitrary, possibly noncomputable, bijection. You can can put Cantors in 1-1 correspondence with the naturals, but any such correspondence is not computable.

This came at a very appropriate time for me, as I was discussing similar issues on a mailing list:

http://groups.google.com/group/magic-list/browse_thread/thread/a314e33d0bbcc1b5

This is relevant, too: Representations of uncomputable and uncountable sets.

I’m glad you’re giving this some attention again. This came up two years ago on Haskell Café (a discussion in which you participated as well) and it’s been in the back of my mind ever since.

http://www.mail-archive.com/haskell-cafe@haskell.org/msg52999.html