Certificate Design Pattern

When working the latest incarnation of my System IG compiler, I used a thingy which I now realize ought to be characterized as a design pattern. It substantially changed the way I was thinking about the code, which is what makes it interesting.

Summary: separate an algorithm into certificate constructors and a search algorithm.

A large class of algorithms can be considered, in some way, as search algorithms. It is given a problem and searches for a solution to that problem. For example, typically you wouldn’t phrase the quadratic formula as a search algorithm, but it is—it’s just a very smart, fast one. It is given a,b, and c and searches for a solution to the equation ax2 + bx + c = 0.

The certificate design pattern separates the algorithm into two modules: the certificate module and the algorithm. The certificate module provides constructors for solutions to the problem. For each correct solution, it is possible to construct a certificate, and it is impossible to construct a certificate for an incorrect solution. The certificate module for the quadratic formula algorithm might look like this:

module Certificate (Certificate, certify, solution) where

data Certificate = Certificate Double Double Double Double

certify :: Double -> Double -> Double -> Double -> Maybe Certificate
certify a b c x | a*x^2 + b*x + c == 0 = Just (Certificate a b c x)
                | otherwise            = Nothing

solution :: Certificate -> (Double,Double,Double,Double)
solution (Certificate a b c x) = (a,b,c,x)

There is only one way to construct a Certificate, and that is to pass it a solution to the quadratic equation. If it is not actually a solution, a certificate cannot be constructed for it. This module is very easy to verify. The algorithm module is obvious:

module Algorithm (solve) where
import Certificate
import Data.Maybe (fromJust)

solve :: Double -> Double -> Double -> Certificate
solve a b c = fromJust $ certify a b c ((-b + sqrt (b^2 - 4*a*c)) / (2*a))

Here, we use the quadratic formula and construct a certificate of its correctness. If we made a typo in the formula, then certify would return Nothing and we would get an error when we fromJust it (an error is justified in this case, rather than an exception, because we made a mistake when programming — it’s like an assert).

The client to the algorithm gets a certificate back from solve, and can extract its solution. All the information needed to verify that the certificate is a correct certificate for the given problem should be provided. For example, if Certificate had only contained x instead of a,b,c,x, then we could have implemented solve like:

solve a b c = certify 0 0 0 0

Because that is a valid solution, but we have not solved the problem. The client needs to be able to inspect that a,b,c match the input values. Maximally untrusting client code might look like this:

unsafeSolve a b c = 
  let (a',b',c',x) = solution (solve a b c) in assert (a == a' && b == b' && c == c') x
  where
  assert True x = x
  assert False _ = error "Assertion failed"

Here we can give any function whatsoever for solve, and we will never report an incorrect answer (replacing the incorrectness with a runtime error).

This is certainly overkill for this example, but in the System IG compiler it makes a lot of sense. I have a small set of rules which form well-typed programs, and have put in much effort to prove this set of rules is consistent and complete. But I want to experiment with different interfaces, different inference algorithms, different optimizations, etc.

So my Certificate implements combinators for each of the rules in my system, and all the different algorithms plug into that set of rules. So whenever I write a typechecker algorithm, if it finds a solution, the solution is correct by construction. This gives me a lot of freedom to play with different techniques.

Verification rules can be more involved than this single function that constructs a certificate. In the System IG compiler, there are 12 construction rules, most of them taking other certificates as arguments (which would make them certificate “combinators”). I’ll show an example of more complex certificate constructors later.

What is interesting about this pattern, aside from the added correctness and verification guarantees, is that is changed the way I thought while I was implementing the algorithm. Instead of being master of the computer, and telling it what to do, it was more like a puzzle I had to solve. In some ways it was harder, but I attribute that to redistributing the workload; it’s harder because I am forced to write code that is correct from the get-go, instead of accidentally introducing bugs and thinking I’m done.

The other interesting mental change was that it often guided my solution. I would look at the certificate I’m trying to create, and see which constructors could create it. This gave me an idea of the information I was after. This information is the information necessary to convince the client that my solution is correct; I cannot proceed without it.

Theoretically, the algorithm part could be completely generic. It might just do a generic search algorithm like Dijkstra. If it finds a certificate, then it has solved your problem correctly. Solutions for free! (But this will not be practical in most cases — it might not yield a correct algorithm by other criteria, such as “always halts”).

Here’s an example of a more complex certificate. The domain is SK combinator calculus, and a Conversion is a certificate that holds two terms. If a Conversion can be constructed, then the two terms are convertible.

module Conversion ( Term(..), Conversion
                  , convId, convCompose, convFlip
                  , convS, convK, convApp)
where

infixl 9 :*
data Term = S | K | Term :* Term   deriving (Eq)
data Conversion = Term :<-> Term

convTerms (a :<-> b) = (a,b)

convId t = t :<-> t

convCompose (a :<-> b) (b' :<-> c)
    | b == b' = Just $ a :<-> c
    | otherwise = Nothing

convFlip (a :<-> b) = b :<-> a

convS (S :* x :* y :* z) = Just $ (S :* x :* y :* z)  :<->  (x :* z :* (y :* z))
convS _ = Nothing

convK (K :* x :* y) = Just $ (K :* x :* y)  :<->  x
convK _ = Nothing

convApp (a :<-> b) (c :<-> d) = (a :* c) :<->  (b :* d)

The export list is key. If we had exported the (:<->) constructor, then it would be possible to create invalid conversions. The correctness of a certificate module is all about what it doesn’t export.

I’m wondering what the best way to present this as an object-oriented pattern is, so I can insert it into popular CS folklore (assuming it’s not already there ;-).

About these ads

9 thoughts on “Certificate Design Pattern

  1. Hi Luke,
    thanks for your blog posts. It’s always fun to read them.

    The design pattern you’re describing here reminds me of the theorem prover LCF (http://en.wikipedia.org/wiki/LCF_theorem_prover) and its successors Isabelle and HOL Light. All of them use abstract datatypes to manage certified solutions. In their case those solutions are valid theorems. Perhaps you can even get some standard terminology from their accompanying literature.

    With respect to Isabelle a good starting point could be its reference manual (isabelle.in.tum.de/doc/ref.pdf). In section 5.2 you can find the primitive theorem combinators, while in section 6.7 and 6.9, certified terms and types are described.

    Keep up the good work!

  2. Nice post.

    One minor observation:

    Shouldn’t your definition of convApp look like:

    convApp (a : b) (c : d) = (a :* c) : (b :* d)

    or perhaps more clearly, two seperate combinators, to reduce the chance of transposition errors?

    convAppL (a : b) c = (a :* c) : (b :* c)
    convAppR a (b : c) = (a :* b) : (a :* c)

  3. Glad to see I’m not the only one to have stumbled on this technique!

    The idea of using private data constructors to witness useful properties seems too obvious to be novel, yet I can’t recall ever seeing it described, let alone named. Sounds like a good candidate for a Design Pattern.

  4. Nice post! Like all good ideas, it seems obvious in retrospect, but I had never thought of it this clearly and generally before. Your “certificates” are akin to types of the form (a -> b -> c …. -> Prop) in Coq, with the combinators as data constructors. Of course, using something like Coq (or similar systems) gives you some additional guarantees, namely that certificates can be erased before runtime, and that you’ll never be able to get invalid certificates that still typecheck due to nontermination. But I imagine that for most applications (like your examples above) guaranteeing termination is easy enough to do by inspection of the certificate module; in that case you can ‘erase’ in a lazy language by simply never pattern matching on the generated certificates, so you don’t waste memory by actually constructing them at runtime! If everything in your certificate module is terminating, then the fact that your certificate-using code typechecks is good enough.

  5. One of the simplest possible instantiations of this pattern is the NonEmpty list package, which I recently uploaded to hackage. (It could do with some more functions, I admit.)

    Although, instead of just yielding Maybe NonEmpty, I generalise to any monad.

  6. The first thing that springs to mind when I read this is proof carrying code as seen in dependently typed languages such as Agda. That is, dependent data types as the resulting type of applying an algorithm to its inputs that prove their own correctness.

  7. Just as previous commenters have noted, certificates are very common in areas such as theorem proving, proof carrying code etc. But there the notion of certificate is somewhat stronger than what you propose. Certificates are proofs in some formal logic that the solution does indeed solve the problem. Encoding logics in Haskell can of course be problematic since the type system isn’t that strong.
    Another property that is often required of certificates is that they are easy to check, for some definition of easy. And checking a certificate can have many applications depending on the domain, but one is of course to test the algorithm which generates the certificate.

    In dependently typed languages like Agda one can even say that this feature is built into the language but on a different level. Suppose that you write a type which corresponds to some theorem you want to show about your code (via Curry-Howard). Writing a term which has this type is actually to create a certificate that shows that the theorem is true. Generating the term might be very difficult but checking it is as easy as type checking.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s