I would like to address a statement by Paul Snively during a Twitter conversation.
The notion that math has meaning apart from being computable is perverse.
I admit that all involved in the discussion are computer scientists, and so we would be predisposed toward constructivism and, in particular, this kind of value system. Indeed, I consider myself “a constructivist” — in the right setting, I will fiercely argue with a classicist that “there exists” does not mean what they think it means — but I will not go this far.
The underlying idea of constructivism is that the form of the theorem describes some evidence that must be used to support it. Evidence for (P or Q) needs either evidence for P or evidence for Q; evidence for (exists x such that P(x)) needs a concrete value (e.g. a number) x and evidence for P(x) for that x; evidence for (P implies Q) is a function that maps evidence for P into evidence for Q; and so on. The two theorems (1) “given any integer, there is a prime number larger than it” and (2) “there are not finitely many prime numbers” are in fact different statements. The evidence for (1) must be a computable function which maps integers to prime numbers which are larger; the evidence for (2) is a function which takes evidence that there are finitely many prime numbers (essentially an exhaustive list) and produces a contradiction. (2) is the form of Euclid’s famous proof, but it is not as strong as (1), which gives a computable process that generates primes. Idiomatically we would call (1) constructive and (2) non-constructive, but the finer distinction is that constructive mathematics distinguishes these two statements while classical mathematics considers them equivalent.
In practice, this means that you cannot use proof by contradiction, the identity that “¬∀x. P(x)” implies “∃x. ¬P(x)”, or the axiom of choice (which claims the existence of a function without giving a way to compute it). The necessary evidence can be extracted from proofs constructed using the remaining axioms.
If you alter the laws of evidence, you can recover the law of excluded middle (proof by contradiction), which says that for any proposition P, (P or not P) is true. Classical mathematicians consider it true that “there are either infinitely many or finitely many twin primes”. Constructively, however, this says if you have a proof of this statement, then you either have a proof of P or a proof of (not P). At the time of writing, this is not true of whether there are infinitely many twin primes; we do not yet have a proof either way. But if you allow into your language of evidence the ability to invoke continuations, then we do have such evidence: the one we have is a proof of (not P), which is a function that takes evidence for P and produces a contradiction. So you pass this function evidence for P because you need the contradiction, but instead of giving you the contradiction you wanted it goes back in time to change its mind, now saying that the evidence for (P or not P) is the evidence for P (which it was just given). Yes, it’s ridiculous, but could be considered constructive if you have a different slant on the meaning of evidence.
But don’t be so hasty in using this ridiculous interpretation against the law of excluded middle. The Ackermann function — a function which grows extremely fast — is constructively definable. However, A(4,4) is far, far greater than the number of elementary particles in the known universe. Using the numeral for A(4,4) as evidence is physically impossible, and yet it is considered valid constructive evidence. This puts constructivism on less sound scientific footing: a constructive theorem need not have actual evidence, it need only have evidence in principle. But what principle? How can one justify that A(4,4) is more real than a well-ordering of the real numbers? — we can give concrete evidence for neither. The proof that A(4,4) is a well-defined number relies on abstract reasoning founded in the same logical ideas that gave rise to the law of excluded middle.
This style of argument is associated with ultrafinitism — the idea that even very large finite numbers may not exist (pay attention to the word may — the idea that a finite number does not exist is intentionally outside the realm of ultrafinitism’s ability to answer). Classical mathematics says there exist arbitrary choice functions, constructive mathematics says those may not exist but A(4,4) does, ultrafinitism says that A(4,4) (and sometimes even numbers as small as 2100) may not exist. These distinctions seem all to be rooted in a sort of fight over which Platonic abstract concepts exist. Perhaps some, such as my friend Paul, would say “are meaningful” instead, but it’s the same idea. It is not as if only one of these philosophies has information to extract. Ultrafinite arguments construct observable evidence, constructive arguments construct idealized evidence, classical arguments discuss idealized existence. If you were to rephrase a classical existence argument “there exists a non-recursive set of integers” to “not every set of integers is recursive” then it becomes constructively valid (it is a diagonal argument). In fact, every sentence provable in classical logic has a corresponding sentence provable in constructive logic, by a simple syntactic transformation. I find it, then, irrational to consider that the former be meaningless and the latter meaningful. So we are merely arguing over the semantics of the word “exists” (and in general, “or”, “implies”, etc. as well). We are arguing about what the sentences mean, not whether they are meaningful. Classical existence is different than constructive existence, and neither corresponds to physical existence.
Paul says, “you can’t actually cut a sphere up and reassemble it into two spheres identical in volume to the first.” I respond by saying that you can’t actually separate an arbitrarily small piece of a box either (first of all, a box made of what?), which constructively is allowed. Mathematics is about mentally idealized objects — if we can “in principle” separate an arbitrary piece of a box, then we can also “in principle” cut a sphere up and reassemble it into two spheres of identical volume, they are merely operating by different principles. Fortunately, we can examine the proofs of these theorems to find out by which principles they are operating. But if you are going to bless one “in principle” as meaningful and banish the others — which I beg you not to — I can see no other way than to resort to the physical reality we are given, to ultrafinitism.
We are all familiar with Cantor’s diagonal argument that proves there exist infinite sets which are “larger” than the set of natural numbers. In this post I will show that we can express this argument in the form of a program, thus showing that there are countable sets which are “computably uncountable”.
I begin with the program itself:
type Cantor = Nat -> Bool diagonal :: (Nat -> Cantor) -> Cantor diagonal cs n = not (cs n n)
Cantor is “the cantor space”, the type of infinite sequences of booleans. We will call such an infinite sequence “a Cantor“. There are clearly infinitely many Cantors; e.g. take the range of this function which gives False at every position except the one specified:
unit :: Nat -> Cantor unit m n = m == n
diagonal is (Georg) Cantor’s diagonal argument written as a program — it takes an alleged sequence of all Cantors, and returns a Cantor which does not occur in the sequence, by construction. This function shows by contradiction that we cannot put Cantors in 1 to 1 correspondence with naturals, and thus that there are more Cantors than there are naturals.
So how many Cantors are there? Since Nat -> Bool is a Haskell type — the type of computable functions from Nat to Bool — Cantors must be representable by programs. We can encode programs as numbers by treating their source code as base-128 numbers. Hence, there are no more Cantors than naturals, and so Cantors can be put into 1 to 1 correspondence with naturals.
Wait — what? There are more Cantors than Nats, but they both have the same size? Something is wrong. Indeed, in the process of this argument we have asserted both
- “We cannot put Cantors in 1 to 1 correspondence with naturals”
- “Cantors can be put into 1 to 1 correspondence with naturals”
We clearly can’t have both.
The erroneous statement is (2). It is undecidable whether a given program represents a Cantor. If the nth Cantor is ⊥ at n, then diagonal will fail: diagonal cs n = not (cs n n) = not ⊥ = ⊥. Because ⊥ is a fixed point of not, diagonal cannot return an element different from the one it was given. Thus for diagonal to work, we must require that Cantors be fully-defined — no infinite loops!
With this requirement, we can no longer put Cantors in 1 to 1 correspondence with the naturals, because we would have to solve the halting problem. It is not enough that the type of the term is a Cantor, it now must be fully defined for all inputs, and determining that given arbitrary source code is an undecidable problem.
The erroneous statement is (1). Cantors are computable functions, so as we have argued, they have the same cardinality as the naturals. There are no more programs than numbers, so by the definition of equal cardinality we can put them in 1 to 1 correspondence with a function.
The problem with (1) occurs because diagonal takes as its first argument not an arbitrary sequence of Cantors, but a computable sequence of Cantors. If cs is not computable, then neither is diagonal cs (for we no longer have cs‘s source code with which to construct it), and Cantors are defined to be computable sequences. So diagonal fails to contradict our bijection.
The erroneous statement is (2). Section II claims to put Cantors and naturals in 1 to 1 correspondence, but it is lying. Suppose Section II is formulated with respect to some axiom system A. If it were “telling the truth”, we would expect there to be some term f in the language of A such that for every fully defined Cantor program c, there is some natural number n such that we have (i.e. it is a theorem of A that f(1 + 1 + … + 1) = (source code of c)).
Let’s suppose we have written down the axioms of A into a Haskell program, and we have a (partial) function proofSearch :: Nat -> Cantor, which, given a number n, searches for theorems of the form and compiles and returns the first such c it finds. In the case that there is no such statement, it just runs forever; similarly for the case that c fails to compile. Although cumbersome to write, I’m sure we agree that this is possible to write. If section II is not lying, then we expect that for every natural n, proofSearch n does in fact return a valid Cantor.
Now, let us return to familiar lands with a new program:
evidence :: Cantor evidence = diagonal proofSearch
Oh my! If section II is the truth, then proofSearch is a total, computable function of type Nat -> Cantor, which we can pass to diagonal to find a Cantor that it missed! So it must have been lying, either (1) about its function f finding every possible Cantor or (2) about it actually possessing such a function (i.e. it “proved” that there is such a function, but it couldn’t actually represent it). In either case, it did not actually create a 1 to 1 correspondence between the naturals and Cantors.
Left as an exercise for the reader.
Which one is it really?