# Searchable Data Types

A few years ago, Martín Escardó wrote an article about a seemingly-impossible program that can exhaustively search the uncountably infinite "Cantor space" (infinite streams of bits). He then showed that spaces that can be thus searched form a monad (which I threw onto hackage), and wrote a paper about the mathematical foundations which is seemingly impossible to understand.

Anyway, I thought I would give a different perspective on what is going on here. This is a more operational account, however some of the underlying topology might peek out for a few seconds. I’m no topologist, so it will be brief and possibly incorrect.

Let’s start with a simpler infinite space to search. The lazy naturals, or the one-point compactification of the naturals:

``````data Nat = Zero | Succ Nat
deriving (Eq,Ord,Show)
infinity = Succ infinity
``````

Let’s partially instantiate `Num` just so we have something to play with.

``````instance Num Nat where
Zero   + y = y
Succ x + y = Succ (x + y)
Zero   * y = Zero
Succ x * y = y + (x * y)
fromInteger 0 = Zero
fromInteger n = Succ (fromInteger (n-1))
``````

We wish to construct this function:

``````search :: (Nat -> Bool) -> Maybe Nat
``````

Which returns a `Nat` satisfying the criterion if there is one, otherwise `Nothing`. We assume that the given criterion is total, that is, it always returns, even if it is given `infinity`. We’re not trying to solve the halting problem. :-)

Let’s try to write this in direct style:

``````search f | f Zero    = Just Zero
| otherwise = Succ <\$> search (f . Succ)
``````

That is, if the predicate worked for `Zero`, then `Zero` is our guy. Otherwise, see if there is an `x` such that `f (Succ x)` matches the predicate, and if so, return `Succ x`. Make sense?

And it seems to work.

``````ghci> search (\x -> x + 1 == 2)
Just (Succ Zero)
ghci> search (\x -> x*x == 16)
Just (Succ (Succ (Succ (Succ Zero))))
``````

Er, almost.

``````ghci> search (\x -> x*x == 15)
(infinite loop)
``````

We want it to return `Nothing` in this last case. It’s no surprise that it didn’t — there is no condition under which `search` is capable of returning `Nothing`, that definition would pass if `Maybe` were defined `data Maybe a = Just a`.

It is not at all clear that it is even possible to get what we want. But one of Escardó’s insights showed that it is: make a variant of `search` that is allowed to lie.

``````-- lyingSearch f returns a Nat n such that f n, but if there is none, then
-- it returns a Nat anyway.
lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch f | f Zero = Zero
| otherwise = Succ (lyingSearch (f . Succ))
``````

And then we can define our regular `search` in terms of it:

``````search' f | f possibleMatch = Just possibleMatch
| otherwise       = Nothing
where
possibleMatch = lyingSearch f
``````

Let’s try.

``````ghci> search' (\x -> x*x == 16)   -- as before
Just (Succ (Succ (Succ (Succ Zero))))
ghci> search' (\x -> x*x == 15)
Nothing
``````

Woah! How the heck did it know that? Let’s see what happened.

``````let f = \x -> x*x == 15
lyingSearch f
0*0 /= 15
Succ (lyingSearch (f . Succ))
1*1 /= 15
Succ (Succ (lyingSearch (f . Succ . Succ)))
2*2 /= 15
...
``````

That condition is never going to pass, so `lyingSearch` going to keep taking the `Succ` branch forever, thus returning `infinity`. Inspection of the definition of `*` reveals that `infinity * infinity = infinity`, and `infinity` differs from `15` once you peel off 15 `Succ`s, thus `f infinity = False`.

With this example in mind, the correctness of `search'` is fairly apparent. Exercise for the readers who are smarter than me: prove it formally.

Since a proper `Maybe`-returning search is trivial to construct given one of these lying functions, the question becomes: for which data types can we implement a lying search function? It is a challenging but approachable exercise to show that it can be done for every recursive polynomial type, and I recommend it to anyone interested in the subject.

Hint: begin with a `Search` data type:

``````newtype Search a = S ((a -> Bool) -> a)
``````

Implement its Functor instance, and then implement the following combinators:

``````searchUnit :: Search ()
searchEither :: Search a -> Search b -> Search (Either a b)
searchPair :: Search a -> Search b -> Search (a,b)
newtype Nu f = Roll { unroll :: f (Nu f) }
searchNu :: (forall a. Search a -> Search (f a)) -> Search (Nu f)
``````

More Hint: is `searchPair` giving you trouble? To construct `(a,b)`, first find `a` such that there exists a `y` that makes `(a,y)` match the predicate. Then construct `b` using your newly found `a`.

The aforementioned Cantor space is a recursive polynomial type, so we already have it’s search function.

``````type Cantor = Nu ((,) Bool)
searchCantor = searchNu (searchPair searchBool)

ghci> take 30 . show \$ searchCantor (not . fst . unroll . snd . unroll)
"(True,(False,(True,(True,(True"
``````

We can’t expect to construct a reasonable `Search Integer`. We could encode in the bits of an `Integer` the execution trace of a Turing machine, as in the proof of the undecidability of the Post correspondence problem. We could write a total function `validTrace :: Integer -> Bool` that returns `True` if and only if the given integer represents a valid trace that ends in a halting state. And we could also write a function `initialState :: Integer -> MachineState` that extracts the first state of the machine. Then the function `\machine -> searchInteger (\x -> initialState x == machine && validTrace x)` would solve the halting problem.

The reason this argument doesn’t work for `Nat` is because the `validTrace` function would loop on `infinity`, thus violating the precondition that our predicates must be total.

I hear the following question begging to be explored next: are there any searchable types that are not recursive polynomials?

## 14 thoughts on “Searchable Data Types”

1. Can you explain what you mean by the last question? Martin Escardo characterized searchable datatypes. For example, a retract of a searchable datatype is searchable. Are “recursive polynomials” closed under retractions?

2. It looks like the key point is that the predicate must terminate on `infinity`, which means that it can only inspect a bounded amount of `Succ`s before terminating. Fascinating!

3. Thanks Heinrich Apfelmus for answering my question before I asked it. After seeing this I thought “but that can’t always work”, and it didn’t take long to find a counterexample: it runs forever trying to find a solution to x == x + 1.

I think the right insight to take away from the failure, though, is not that the predicate must terminate for infinity, but simply that predicate must always terminate. Luke expanded the type of integers to its minimal compactification, and in doing so, added a new element. This turns certain predicates from always-terminating into not-always-terminating.

4. And, I meant to add, if the predicate is not guaranteed to terminate, then it is unreasonable to expect the search to be guaranteed to terminate!

5. Job Vranish says:

Just to clarify, it looks like that if the predicate is total, and we are able to construct a valid lyingSearch function, that our search will always terminate? Is this true?

6. Invariably your blog seems to mangle any comment where I try to paste code, so I’ve given my reply with an implementation on reddit:

You can exploit Hinze style generics to get implementations of lyingSearch for recursive polynomial data types for free, as long as you have the instance.

Second, you can search for (some) functions using this mechanism by tabulating them and searching for appropriate tabulations (by exploiting side-effects you can limit the amount of data you need to tabulate). Since every search predicate has to terminate for all inputs, it can only inspect the result of applying the function to a finite number of inputs.

7. Dan says:

I don’t understand why you can’t implement the Mu operator and get all the mu-recursive functions this way, but maintain totality? Something is not getting to me.

8. Luke says:

@Andrej, I meant begging to be explored by me. I realize Escardo has done a lot in this area, but I don’t understand the more mathematically technical parts of his work. My first exposure to retracts was looking them up on Wikipedia when you mentioned it :-/

@Job, well yeah. If lyingSearch p returns *anything* (with no bottoms I mean), then p (lyingSearch p) will terminate, because p is total.

@Edward, I didn’t know about that mechanism for instantiating generics. Cool. Your function instance requires the domain to have decidable equality, so it wouldn’t work on eg. Cantor, which is not discrete. I wonder if that can be relaxed further.

@Dan, your question is getting at something I’m still trying to understand completely. The trouble here is that the lyingSearch in the article for Nat wouldn’t work for Mu Maybe, because Mu Maybe doesn’t have an infinite element. lyingSearch (const False) would give bottom instead of infinity. But that is not an entirely satisfying explanation. It has something to do with compactness.

9. Why doesn’t Mu Maybe have infiniteMaybe = Just infiniteMaybe?

10. Dan: because you are producing the value — effectively with an anamorphism — so you are producing a value in Nu, not Mu. If your predicate never succeeds, you never reach the base case. The reasoning that permits this is inherently coinductive.

11. Luke says:

@Mark, Mu is the least fixed point, whereas Nu is the greatest. Mu Maybe = the least set such that 1+X = X, i.e. the naturals, which has no infinity. The difference between Mu and Nu comes down to strictness: data MuMaybe = Zero | Succ !MuMaybe, whereas data NuMaybe = Zero | Succ NuMaybe.

I’ve been trying to prove that, but I can’t remember what foundation I need.

12. Pardon me for bringing up an old post, but how do I define search over naturals using the infinite search monad? I tried this:

natural :: Set Nat
natural = foldr1 union \$ map (singleton) \$ iterate S Z

where data Nat = Z | S Nat. The monad search loops on (\x -> x*2 == 1), whereas doing it with search’ correctly returns Nothing. Both searches correctly return nothing on (\x -> x+2 == 1).

13. Luke says:

@Dan, your `natural` describes the set {Z, SZ, SSZ, SSSZ, …}, which is not searchable — it is missing SSSS… (aka `fix S`). The recursive definition `natural = union (singleton Z) (S <\$> natural)` should work. (So recursive definitions of Sets apparently do *not* correspond to the least fixed point in the subset sense, otherwise these definitions should be the same)