We just had the first Categories for the Boulderite meetup, in which a bunch of people who don’t know category theory tried to teach it to each other. Some of the people there had not had very much experience with proofs, so getting “a proof” was hard even though the concepts weren’t very deep. I got the impression that those who had trouble mainly struggled because they did not yet know the “follow your nose” proof tactic, which I learned in my first upper-division math class in college. That tactic is so often used that most proofs completely omit it (i.e. assume that the reader is doing it) and skip to when it gets interesting. Having it spelled out for me in that class was very helpful. So here I shall repeat it, mostly for my fellow Categories members.

Decide what to do based on a top-down analysis of the sentence you are trying to prove:

| Shape of Sentence | Shape of Proof |
| --- | --- |
| If P, then Q. (aka P implies Q) | Suppose P. [proof of Q] |
| P if and only if Q | (→) [proof of P implies Q]. (←) [proof of Q implies P] |
| For all x such that C(x), Q | Given x. Suppose C(x). [proof of Q] |
| There exists x such that Q. | Let x = [something] (requires imagination). [proof of Q] |
| P or Q | Either [proof of P] or [proof of Q] (or sometimes something tricksier like assume not P, [proof of Q]) |
| P and Q | (1) [proof of P]. (2) [proof of Q]. |
| not P | Assume P. [derive a contradiction] (requires imagination) |
| X = Y | Reduce X and Y by known equalities one step at a time (whichever side is easier first). Or sometimes there are definitions / lemmas that reduce equality to something else. |
| Something really obvious (like X = X, or 0 ≤ n where n is a natural, etc.) | Say “obvious” or “trivial” and you’re done. |
| Something else | Find definition or lemma, substitute it in, continue. |
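For readers who like to see this mechanized, here is a rough sketch of how several of these goal shapes look in Lean 4, using only the core library (no Mathlib assumed). The propositions and hypotheses are made up for illustration; the point is only that each goal shape picks the first tactic for you.

```lean
-- "If P, then Q": suppose P, then prove Q.
example (P Q : Prop) (hq : Q) : P → Q := by
  intro _hp        -- Suppose P.
  exact hq         -- [proof of Q]

-- "P if and only if Q": prove each direction.
example (P Q : Prop) (hpq : P → Q) (hqp : Q → P) : P ↔ Q := by
  constructor
  · exact hpq      -- (→)
  · exact hqp      -- (←)

-- "For all x such that C(x), Q": given x, suppose C(x), prove Q.
example : ∀ n : Nat, 0 < n → 0 ≤ n := by
  intro n _hn      -- Given n. Suppose 0 < n.
  exact Nat.zero_le n

-- "There exists x such that Q": pick a witness (requires imagination).
example : ∃ n : Nat, n + 2 = 5 := by
  refine ⟨3, ?_⟩   -- Let n = 3.
  rfl              -- 3 + 2 = 5 holds by computation.

-- "P or Q": commit to one side.
example (P Q : Prop) (hp : P) : P ∨ Q := by
  apply Or.inl
  exact hp

-- "P and Q": prove both parts.
example (P Q : Prop) (hp : P) (hq : Q) : P ∧ Q := by
  constructor
  · exact hp       -- (1)
  · exact hq       -- (2)

-- "not P": assume P and derive a contradiction.
example (P : Prop) (h : P → False) : ¬P := by
  intro hp         -- Assume P.
  exact h hp       -- Contradiction.
```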

Along the way, you will find that you need to use the things you have supposed. So there is another table for how you can use assumptions.

| Shape of assumption | Standard usage |
| --- | --- |
| If P, then Q (aka P implies Q) | Prove P. Then you get to use Q. |
| P if and only if Q | P and Q are equivalent. Prove one, you get the other. |
| For all x such that C(x), P(x) | Prove C(y) for some y that you have, then you get to use P(y). |
| There exists x such that C(x) | Use x and the fact that C(x) somehow (helpful, right? ;-). |
| P and Q | Therefore P / Therefore Q. |
| P or Q | If P then [proof of goal]. If Q then [proof of goal]. (Or sometimes prove not P, then you know Q) |
| not P | Prove P. Then you’re done! (You have inconsistent assumptions, from which anything follows) |
| X = Y | If you are stuck and have an X somewhere in your goal, try substituting Y. And vice versa. |
| Something obvious from your other assumptions. | Throw it away, it doesn’t help you. |
| Something else | Find definition, substitute it in, continue. |
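The same kind of sketch works for assumptions. Again this is Lean 4 with only the core library, and every proposition and hypothesis name below is invented for illustration.

```lean
-- "If P, then Q" as an assumption: prove P, then you get Q.
example (P Q : Prop) (hpq : P → Q) (hp : P) : Q := by
  apply hpq        -- It now suffices to prove P.
  exact hp

-- "For all x such that C(x), P(x)": feed it a y with C(y).
example (C P : Nat → Prop) (h : ∀ x, C x → P x) (y : Nat) (hy : C y) : P y := by
  exact h y hy

-- "There exists x such that C(x)": unpack x and the fact C(x).
example (C D : Nat → Prop) (h : ∃ x, C x) (hcd : ∀ x, C x → D x) : ∃ y, D y := by
  cases h with
  | intro x hx => exact ⟨x, hcd x hx⟩

-- "P and Q": therefore Q.
example (P Q : Prop) (h : P ∧ Q) : Q := by
  exact h.2

-- "P or Q": prove the goal once in each case.
example (P Q R : Prop) (h : P ∨ Q) (hpr : P → R) (hqr : Q → R) : R := by
  cases h with
  | inl hp => exact hpr hp
  | inr hq => exact hqr hq

-- "not P" together with P: inconsistent assumptions, anything follows.
example (P Q : Prop) (hnp : ¬P) (hp : P) : Q := by
  exact absurd hp hnp

-- "X = Y": substitute one side for the other in the goal.
example (f : Nat → Nat) (x y : Nat) (h : x = y) : f x = f y := by
  rw [h]           -- Goal becomes f y = f y, which rw closes by reflexivity.
```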

Let’s try some examples. First some definitions/lemmas to work with:

Definition (extensionality): If X and Y are sets, then X = Y if and only if for all x, $x \in X$ if and only if $x \in Y$.
Definition: $X \subseteq Y$ if and only if for every a, $a \in X$ implies $a \in Y$.

Theorem: X = Y if and only if $X \subseteq Y$ and $Y \subseteq X$.

• (→) Show X = Y implies $X \subseteq Y$ and $Y \subseteq X$.
• Assume X = Y. Show $X \subseteq Y$ and $Y \subseteq X$.
• Substitute: Show $X \subseteq X$ and $X \subseteq X$.
• Both parts are obvious, so we’re done.
• (←) Show $X \subseteq Y$ and $Y \subseteq X$ implies $X = Y$.
• Assume $X \subseteq Y$ and $Y \subseteq X$. Show $X = Y$.
• (expand definition of = by extensionality)
• Show for all x, $x \in X$ if and only if $x \in Y$.
• Given x.
• (→) Show $x \in X$ implies $x \in Y$.
• Follows from our assumption $X \subseteq Y$, by the definition of $\subseteq$.
• (←) Show $x \in Y$ implies $x \in X$.
• Follows from our assumption $Y \subseteq X$, by the definition of $\subseteq$.
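The step-by-step proof above can be replayed almost line for line in Lean 4. This is only a sketch under a modelling assumption that is not in the definitions above: a "set" of elements of `α` is represented as a predicate `α → Prop`, so that $x \in X$ becomes `X x`. The names `SubsetOf` and `pred_ext` are hypothetical, and `pred_ext` covers only the direction of extensionality that the proof uses.

```lean
-- Assumption: a "set" is modelled as a predicate, so `X a` plays the role of a ∈ X.
def SubsetOf {α : Type} (X Y : α → Prop) : Prop :=
  ∀ a, X a → Y a

-- The direction of extensionality we need: same elements implies equal.
theorem pred_ext {α : Type} (X Y : α → Prop) (h : ∀ x, X x ↔ Y x) : X = Y :=
  funext fun x => propext (h x)

theorem eq_iff_subset_subset {α : Type} (X Y : α → Prop) :
    X = Y ↔ (SubsetOf X Y ∧ SubsetOf Y X) := by
  constructor
  · -- (→) Assume X = Y.
    intro hXY
    -- Substitute: the goal becomes SubsetOf X X ∧ SubsetOf X X.
    rw [← hXY]
    constructor
    · intro a ha
      exact ha
    · intro a ha
      exact ha
  · -- (←) Assume both inclusions.
    intro h
    -- Expand equality by extensionality.
    apply pred_ext
    -- Given x.
    intro x
    constructor
    · exact h.1 x   -- from SubsetOf X Y
    · exact h.2 x   -- from SubsetOf Y X
```

Each tactic corresponds to one bullet: `constructor` splits an "if and only if" or an "and", `intro` is "Assume" or "Given", and `exact` closes a goal with something we already have.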

See how we are mechanically disassembling the statement we have to prove? Most proofs like this don’t take any deep insight; you just execute this algorithm. Such a process is assumed when reading and writing proofs, so in the real world you will see something more like the following proof:

Proof. (→) trivial. (←) By extensionality, $x \in X$ implies $x \in Y$ since $X \subseteq Y$, and $x \in Y$ implies $x \in X$ since $Y \subseteq X$.

We have left out the statements of assumptions that you would naturally make when following your nose. We have also left out the unfolding of definitions, except perhaps for naming the definition. But when just getting started proving things, it’s good to write out these steps in detail, because then you can see what you have to work with and where you are going. Then begin leaving out obvious steps as you become comfortable.

We have also just justified a typical way to show that two sets are equal: show that they are subsets of each other.

Let’s see one more example:

Definition: Given sets A and B, a function f : A → B is a surjection if for every $y \in B$, there exists an $x \in A$ such that f(x) = y.

Definition: Two functions f,g : A → B are equal if and only if for all $x \in A$, f(x) = g(x).

Definition: $(g \circ f)(x) = g(f(x))$.

Definition: For any set $A$, the identity $\mathit{Id}_A$ is defined by $\mathit{Id}_A(x) = x$.

Theorem: Given f : A → B. If there exists $f^{-1}$ : B → A such that $f \circ f^{-1} = \mathit{Id}_B$, then f is a surjection.

• Given f : A → B.
• Suppose there exists $f^{-1}$ : B → A and $f \circ f^{-1} = \mathit{Id}_B$. Show f is a surjection.
• By definition, show that for all $y \in B$, there exists $x \in A$ such that $f(x) = y$.
• Given $y \in B$. Show there exists $x \in A$ such that $f(x) = y$.
• Now we have to find an x in A. Well, we have $y \in B$ and a function from B to A, let’s try that:
• Let $x = f^{-1}(y)$. Show $f(x) = y$.
• Substitute: Show $f(f^{-1}(y)) = y$.
• We know $f \circ f^{-1} = \mathit{Id}_B$, so by the definition of two functions being equal, we know $f(f^{-1}(y)) = \mathit{Id}_B(y) = y$, and we’re done.
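As a cross-check, here is a sketch of the same argument in Lean 4 (core library only). The name `IsSurjection` is hypothetical, `g` stands in for $f^{-1}$, and Lean's built-in `id` plays the role of $\mathit{Id}_B$.

```lean
-- Hypothetical name for the definition of surjection given above.
def IsSurjection {A B : Type} (f : A → B) : Prop :=
  ∀ y : B, ∃ x : A, f x = y

theorem surjection_of_right_inverse {A B : Type} (f : A → B)
    (h : ∃ g : B → A, f ∘ g = id) : IsSurjection f := by
  -- Unpack the assumption: we get g and the fact that f ∘ g = id.
  cases h with
  | intro g hg =>
    -- By definition, show that for all y there exists x with f x = y.
    show ∀ y : B, ∃ x : A, f x = y
    -- Given y.
    intro y
    -- Let x = g y. It remains to show f (g y) = y.
    refine ⟨g y, ?_⟩
    -- Equal functions agree at every argument: (f ∘ g) y = id y.
    exact congrFun hg y
```

The only steps that needed a choice were the witness `g y` and the use of `hg` at `y`; everything else was dictated by the shape of the goal.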

Again, notice how we are breaking up the task based on the structure of what we are trying to prove. The only non-mechanical things we did were to find x and apply the assumption that $f \circ f^{-1} = \mathit{Id}_B$. In fact, usually the interesting parts of a proof are giving values to “there exists” statements and using assumptions (in particular, saying what values you use “for all” assumptions with). Since those are the interesting parts, those are the only parts that an idiomatic proof would say:

Proof. Given $y \in B$. Let $x = f^{-1}(y)$. $f(x) = f(f^{-1}(y)) = y$ since $f \circ f^{-1} = \mathit{Id}_B$.

Remember to take it step-by-step; at each step, write down what you learned and what you are trying to prove, and try to make a little progress. These proofs are easy if you follow your nose.