Searchable Data Types

A few years ago, Martín Escardó wrote an article about a seemingly-impossible program that can exhaustively search the uncountably infinite "Cantor space" (infinite streams of bits). He then showed that spaces that can be thus searched form a monad (which I threw onto hackage), and wrote a paper about the mathematical foundations which is seemingly impossible to understand.

Anyway, I thought I would give a different perspective on what is going on here. This is a more operational account, however some of the underlying topology might peek out for a few seconds. I’m no topologist, so it will be brief and possibly incorrect.

Let’s start with a simpler infinite space to search. The lazy naturals, or the one-point compactification of the naturals:

data Nat = Zero | Succ Nat
    deriving (Eq,Ord,Show)
infinity = Succ infinity

Let’s partially instantiate Num just so we have something to play with.

instance Num Nat where
    Zero   + y = y
    Succ x + y = Succ (x + y)
    Zero   * y = Zero
    Succ x * y = y + (x * y)
    fromInteger 0 = Zero
    fromInteger n = Succ (fromInteger (n-1))

We wish to construct this function:

search :: (Nat -> Bool) -> Maybe Nat

Which returns a Nat satisfying the criterion if there is one, otherwise Nothing. We assume that the given criterion is total, that is, it always returns, even if it is given infinity. We’re not trying to solve the halting problem. :-)

Let’s try to write this in direct style:

search f | f Zero    = Just Zero
         | otherwise = Succ <$> search (f . Succ)

That is, if the predicate worked for Zero, then Zero is our guy. Otherwise, see if there is an x such that f (Succ x) matches the predicate, and if so, return Succ x. Make sense?

And it seems to work.

ghci> search (\x -> x + 1 == 2)
Just (Succ Zero)
ghci> search (\x -> x*x == 16)
Just (Succ (Succ (Succ (Succ Zero))))

Er, almost.

ghci> search (\x -> x*x == 15)
(infinite loop)

We want it to return Nothing in this last case. It’s no surprise that it didn’t — there is no condition under which search is capable of returning Nothing, that definition would pass if Maybe were defined data Maybe a = Just a.

It is not at all clear that it is even possible to get what we want. But one of Escardó’s insights showed that it is: make a variant of search that is allowed to lie.

-- lyingSearch f returns a Nat n such that f n, but if there is none, then
-- it returns a Nat anyway.
lyingSearch :: (Nat -> Bool) -> Nat
lyingSearch f | f Zero = Zero
              | otherwise = Succ (lyingSearch (f . Succ))

And then we can define our regular search in terms of it:

search' f | f possibleMatch = Just possibleMatch
          | otherwise       = Nothing
    possibleMatch = lyingSearch f

Let’s try.

ghci> search' (\x -> x*x == 16)   -- as before
Just (Succ (Succ (Succ (Succ Zero))))
ghci> search' (\x -> x*x == 15)

Woah! How the heck did it know that? Let’s see what happened.

let f = \x -> x*x == 15
lyingSearch f
0*0 /= 15
Succ (lyingSearch (f . Succ))
1*1 /= 15
Succ (Succ (lyingSearch (f . Succ . Succ)))
2*2 /= 15

That condition is never going to pass, so lyingSearch going to keep taking the Succ branch forever, thus returning infinity. Inspection of the definition of * reveals that infinity * infinity = infinity, and infinity differs from 15 once you peel off 15 Succs, thus f infinity = False.

With this example in mind, the correctness of search' is fairly apparent. Exercise for the readers who are smarter than me: prove it formally.

Since a proper Maybe-returning search is trivial to construct given one of these lying functions, the question becomes: for which data types can we implement a lying search function? It is a challenging but approachable exercise to show that it can be done for every recursive polynomial type, and I recommend it to anyone interested in the subject.

Hint: begin with a Search data type:

newtype Search a = S ((a -> Bool) -> a)

Implement its Functor instance, and then implement the following combinators:

searchUnit :: Search ()
searchEither :: Search a -> Search b -> Search (Either a b)
searchPair :: Search a -> Search b -> Search (a,b)
newtype Nu f = Roll { unroll :: f (Nu f) }
searchNu :: (forall a. Search a -> Search (f a)) -> Search (Nu f)

More Hint: is searchPair giving you trouble? To construct (a,b), first find a such that there exists a y that makes (a,y) match the predicate. Then construct b using your newly found a.

The aforementioned Cantor space is a recursive polynomial type, so we already have it’s search function.

type Cantor = Nu ((,) Bool)
searchCantor = searchNu (searchPair searchBool)

ghci> take 30 . show $ searchCantor (not . fst . unroll . snd . unroll)

We can’t expect to construct a reasonable Search Integer. We could encode in the bits of an Integer the execution trace of a Turing machine, as in the proof of the undecidability of the Post correspondence problem. We could write a total function validTrace :: Integer -> Bool that returns True if and only if the given integer represents a valid trace that ends in a halting state. And we could also write a function initialState :: Integer -> MachineState that extracts the first state of the machine. Then the function \machine -> searchInteger (\x -> initialState x == machine && validTrace x) would solve the halting problem.

The reason this argument doesn’t work for Nat is because the validTrace function would loop on infinity, thus violating the precondition that our predicates must be total.

I hear the following question begging to be explored next: are there any searchable types that are not recursive polynomials?

Did you like this post? Flattr this

15 thoughts on “Searchable Data Types

  1. Can you explain what you mean by the last question? Martin Escardo characterized searchable datatypes. For example, a retract of a searchable datatype is searchable. Are “recursive polynomials” closed under retractions?

  2. It looks like the key point is that the predicate must terminate on infinity, which means that it can only inspect a bounded amount of Succs before terminating. Fascinating!

  3. Thanks Heinrich Apfelmus for answering my question before I asked it. After seeing this I thought “but that can’t always work”, and it didn’t take long to find a counterexample: it runs forever trying to find a solution to x == x + 1.

    I think the right insight to take away from the failure, though, is not that the predicate must terminate for infinity, but simply that predicate must always terminate. Luke expanded the type of integers to its minimal compactification, and in doing so, added a new element. This turns certain predicates from always-terminating into not-always-terminating.

  4. Just to clarify, it looks like that if the predicate is total, and we are able to construct a valid lyingSearch function, that our search will always terminate? Is this true?

  5. Invariably your blog seems to mangle any comment where I try to paste code, so I’ve given my reply with an implementation on reddit:

    You can exploit Hinze style generics to get implementations of lyingSearch for recursive polynomial data types for free, as long as you have the instance.

    Second, you can search for (some) functions using this mechanism by tabulating them and searching for appropriate tabulations (by exploiting side-effects you can limit the amount of data you need to tabulate). Since every search predicate has to terminate for all inputs, it can only inspect the result of applying the function to a finite number of inputs.

  6. I don’t understand why you can’t implement the Mu operator and get all the mu-recursive functions this way, but maintain totality? Something is not getting to me.

  7. @Andrej, I meant begging to be explored by me. I realize Escardo has done a lot in this area, but I don’t understand the more mathematically technical parts of his work. My first exposure to retracts was looking them up on Wikipedia when you mentioned it :-/

    @Job, well yeah. If lyingSearch p returns *anything* (with no bottoms I mean), then p (lyingSearch p) will terminate, because p is total.

    @Edward, I didn’t know about that mechanism for instantiating generics. Cool. Your function instance requires the domain to have decidable equality, so it wouldn’t work on eg. Cantor, which is not discrete. I wonder if that can be relaxed further.

    @Dan, your question is getting at something I’m still trying to understand completely. The trouble here is that the lyingSearch in the article for Nat wouldn’t work for Mu Maybe, because Mu Maybe doesn’t have an infinite element. lyingSearch (const False) would give bottom instead of infinity. But that is not an entirely satisfying explanation. It has something to do with compactness.

  8. Dan: because you are producing the value — effectively with an anamorphism — so you are producing a value in Nu, not Mu. If your predicate never succeeds, you never reach the base case. The reasoning that permits this is inherently coinductive.

  9. @Mark, Mu is the least fixed point, whereas Nu is the greatest. Mu Maybe = the least set such that 1+X = X, i.e. the naturals, which has no infinity. The difference between Mu and Nu comes down to strictness: data MuMaybe = Zero | Succ !MuMaybe, whereas data NuMaybe = Zero | Succ NuMaybe.

    I’ve been trying to prove that, but I can’t remember what foundation I need.

  10. Pardon me for bringing up an old post, but how do I define search over naturals using the infinite search monad? I tried this:

    natural :: Set Nat
    natural = foldr1 union $ map (singleton) $ iterate S Z

    where data Nat = Z | S Nat. The monad search loops on (\x -> x*2 == 1), whereas doing it with search’ correctly returns Nothing. Both searches correctly return nothing on (\x -> x+2 == 1).

  11. @Dan, your `natural` describes the set {Z, SZ, SSZ, SSSZ, …}, which is not searchable — it is missing SSSS… (aka `fix S`). The recursive definition `natural = union (singleton Z) (S <$> natural)` should work. (So recursive definitions of Sets apparently do *not* correspond to the least fixed point in the subset sense, otherwise these definitions should be the same)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s