# What was the probability of that?

A group of college students bask in their crude weekly haven of Texas Hold’em poker. Amid the usual banter, they watch as one deals out the flop: a three of spades, a three of diamonds, and a three of clubs. The room falls silent and tensions are suspended as if the ground had fallen out from beneath the house. The highest card may win this hand, a pocket pair is almost golden, and everyone wonders if someone has the single remaining three. A round of timid betting ensues — a wrong step here could be very expensive. After the round completes, the fourth card is dealt onto the table: the three of hearts. The tensions are dropped as everyone laughs incredulously, and out of the laughter emerges “what’s the probability of that?”

One of the more mathematically adept players pulls out his phone and does a quick calculation: $\frac{4}{52}\frac{3}{51}\frac{2}{50}\frac{1}{49}$ — about 1 in 270,000. Everyone is wowed by the rarity of the event they just witnessed.
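
The phone calculation is easy to reproduce exactly with rational arithmetic. This is only a minimal sketch in Python; the variable name `four_threes` is mine, chosen for illustration.

```python
from fractions import Fraction

# Probability that the first four cards off a shuffled 52-card deck are all threes:
# 4 threes among 52 cards, then 3 among 51, then 2 among 50, then 1 among 49.
four_threes = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50) * Fraction(1, 49)

print(four_threes)          # 1/270725
print(float(four_threes))   # a little under 4 in a million
```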

That is indeed the correct probability to get four consecutive threes from a deck of cards. But is that really the question here? Surely nearly the same response would have occurred if it had been four fours, or four nines. If it were four aces people would be even more astonished. The same response would have occurred if the cards had been four to a straight flush; e.g. the five through eight of spades. There are many such situations. “Four threes” is the most immediate pattern that we recognize as anomalous, but it is not the only anomalous pattern.
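
To get a feel for how much the number moves once the criterion widens, one can enumerate every four-card board and count those matching a broader pattern. The sketch below rests on my own assumptions: `four_of_a_kind` and `four_to_straight_flush` are hypothetical stand-ins for "patterns that would astonish the table", the second counts only four consecutive ranks in one suit (ace low or high), and the order in which the cards arrive is ignored.

```python
from fractions import Fraction
from itertools import combinations

# 1 = ace, 11-13 = jack, queen, king; suits are just labels.
DECK = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

def four_of_a_kind(board):
    """All four cards share a rank."""
    return len({rank for rank, _ in board}) == 1

def four_to_straight_flush(board):
    """Four consecutive ranks in one suit (assumption: ace may be low or high)."""
    if len({suit for _, suit in board}) != 1:
        return False
    ranks = sorted(rank for rank, _ in board)
    if ranks == [1, 11, 12, 13]:          # J-Q-K-A, with the ace played high
        return True
    return ranks == list(range(ranks[0], ranks[0] + 4))

def probability(criterion):
    """Exact probability that a uniformly random 4-card board meets the criterion."""
    boards = list(combinations(DECK, 4))
    hits = sum(1 for board in boards if criterion(board))
    return Fraction(hits, len(boards))

p_quads = probability(four_of_a_kind)
p_either = probability(lambda b: four_of_a_kind(b) or four_to_straight_flush(b))
print(p_quads)    # 1/20825
print(p_either)   # 57/270725, roughly 1 in 4,750
```

Even these two patterns alone make the "astonishing" board tens of times more likely than four threes specifically, and a real table would be astonished by many more patterns than these.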

So what event is really being referred to by “that”? Those specific four cards had a 1 in 6.5 million chance of coming up in the order they did from a player’s perspective before the hand, and a 100% chance of coming up in the order they did from a perspective after the hand [some will note that I am using a Bayesian interpretation of probability at this point]. The probability of the specific real-world event that occurred (including the orientations and temperatures of the cards, the reactions of the players, and the taste of Jake Danser’s bite of Chinese dinner 500 miles away), from the perspective of any of the players, is unbelievably small.
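
For reference, the ordered and unordered figures differ only by the number of ways to order four cards. A worked version, using nothing beyond the numbers already quoted:

$$
\frac{1}{52 \cdot 51 \cdot 50 \cdot 49} = \frac{1}{6{,}497{,}400},
\qquad
4! \cdot \frac{1}{6{,}497{,}400} = \frac{24}{6{,}497{,}400} = \frac{1}{270{,}725}.
$$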

In situations where this question is asked, I always jokingly answer the question with “1”. Most of the time people laugh and let the question rest, but sometimes the conversation turns more serious. In this case I try (in all cases so far, in vain) to explain the plurality of perspective at play here. The “serious” answer I have for this question, doing my best to interpret the intended meaning of the statement while incorporating these issues, is a number I cannot calculate but that I can describe: it is the probability that the speaker would have been astonished by the event that occurred; essentially the probability that he would remark “what’s the probability of that?”. I think that is quite a bit higher than one in 270,000, so the event we witnessed was not as rare as the simple calculation would have us believe.

The dissonance of such situations points to a common simplistic view of probability: that events (in the colloquial sense) have probabilities. A distinction that is commonly understood (and commonly misunderstood) is that between frequentism, which talks about running an experiment a bunch of times and calculating the proportion of experiments that satisfied some criterion, and Bayesianism, which talks about the confidence one has in some property being true in terms of a subjective state of knowledge. This is a fascinatingly subtle distinction: they coincide on most but not all questions, and there is much argument about which one is “right” (I think that is a rather silly argument, as if a probability were something bestowed to us from nature). However, both of these views are calculating the probability of a property (a set of events) being true (happening), and both of these views rely on an assumption of prior information: the frequentists on the set-up of the experiment and the Bayesians on their prior state of knowledge about the world. The idea that an event has a single, objective probability says something very strong about the universe: that there is some essential, natural source of randomness (to dispel any quantum objections, I point to E.T. Jaynes’s critique of the view that the randomness in quantum theory is an essential property of nature). But even if there were some essential source of randomness, surely the universe is more determined than our probabilistic calculations assume: from the view of this universe, the deck cannot be uniformly shuffled after only seven shuffles because it knows how your hands and your brain work. So we were never talking about an essential, natural randomness.

In our normal use of probabilities, we don’t run into this problem because we are predicting the future: e.g. “what is the probability that Obama will be re-elected?”. We conceive of this as a single event, but it is actually a wide collection of events. Given some prior knowledge, this set of events has a definite probability. But we are inconsistent about how we treat future events and past events: past events are not collections — they happened, and they happened in exactly one way. From the perspective of now, the probability that it would happen any other way is zero. From the perspective of before the event, we are asking whether “it” would happen or not, but we are entirely unclear about what measurable property we mean by “it”. We mean something different when we refer to events in the future and events in the past.

In summary, probabilities are functions of (1) prior information and (2) a criterion for judging when it is to be considered true. Probability is meaningless to apply to a single event.
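
The summary can be made concrete by writing probability literally as a function of two arguments. This is a sketch under assumptions of my own: outcomes are unordered four-card boards, "prior information" is modelled as a dictionary of weights over outcomes, and the names `probability`, `before`, and `after` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

DECK = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

def probability(prior, criterion):
    """Total weight, under `prior`, of the outcomes that satisfy `criterion`.

    `prior` maps each outcome to its weight (the weights sum to 1);
    `criterion` is a predicate on outcomes, so it picks out a set of them.
    """
    return sum((w for outcome, w in prior.items() if criterion(outcome)), Fraction(0))

# The board from the story: all four threes, order ignored.
observed = frozenset((3, suit) for suit in "SHDC")

# Prior information before the hand: every 4-card board is equally likely.
boards = [frozenset(b) for b in combinations(DECK, 4)]
weight = Fraction(1, len(boards))
before = {board: weight for board in boards}

# Prior information after the hand: we know exactly which board came up.
after = {observed: Fraction(1)}

def exactly_this(board):
    return board == observed

def any_quads(board):
    return len({rank for rank, _ in board}) == 1

print(probability(before, exactly_this))   # 1/270725
print(probability(before, any_quads))      # 1/20825
print(probability(after, exactly_this))    # 1
```

The same board yields three different numbers, and none of them is "the" probability of what happened; each belongs to a particular pairing of prior and criterion.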

N.B. in probability theory, the word event is already defined to mean a set of outcomes. If you read this article with that definition in mind, you will have been very confused :-).


As our understanding progresses, what once were rigid distinctions tend to become blurred. Hence, I am fascinated by the pervasiveness and stability of the three programming language paradigms: imperative, functional (here I mean that word in the pure sense, not HOFs sense — Conal Elliott would probably have me call it denotational), and logical. Whole languages are beginning to blur their orientation, but specific solutions to problems are still typically classifiable along these lines nonetheless. We will see how the three are linked to forms of human communication and the style of our discourse — and thus why these distinctions are so stable.

Each paradigm corresponds to a syntactic unit of language. The imperative, as the name suggests, corresponds to sentences in the imperative mood:

    from collections import deque

    def bfs(predicate, root):
        queue = deque([root])             # Put the root on the queue.
        while queue:                      # While the queue is not empty, do the following:
            node = queue.popleft()        #   Take a node off the queue.
            if predicate(node):           #   If it satisfies the predicate,
                return node               #     return it.
            for n in node.children():     #   Otherwise,
                queue.append(n)           #     add each of its children to the queue.


The program is written as a recipe, telling the computer what to do as if a servant. Note that, after qualifying phrases, each sentence begins with an action word in the imperative form, just as this sentence begins with “note”. The object-oriented aspect adds a sort of directorial role to the program, wherein the program is read not as the user telling the computer what to do, but the program telling objects in the program what to do. Sentences are still written in the imperative mood, but they can now be directed: “queue, give me an element”, “handle, write down this line.”

But not every sentence in human discourse is imperative, for example this one. The logical captures sentences of relationship, such as:

    captures(logical, relationships).


But perhaps we should see a more practical logical program as an example:

    mirrors([], []).                  % The empty list mirrors itself.
    mirrors([X|XS], L) :-             % [X|XS] mirrors L if
        append(LS, [X], L),           %    LS and [X] compose L, and
        mirrors(XS, LS).              %    XS mirrors LS.


The program is written as a set of assertions, building a model of its world up from nothing. To run this program, you ask it questions about the world:

    ?- mirrors(X, [1,2,3]).           % What mirrors [1,2,3]?
    X = [3,2,1]


A great deal of human discourse falls into this category: propositional sentences. Most of the sentences in this post fall into that category. However, those familiar with Prolog will know that this is a poor implementation of mirrors (it is quadratic time), and would write it this way instead:

    mirror([], R, R).             % The mirror of [] prepended to R is R
    mirror([X|XS], Y, R) :-       % The mirror of [X|XS] prepended to Y is
        mirror(XS, [X|Y], R).     %   the mirror of XS prepended to [X|Y].


In proper logical style, mirror expresses itself as propositional relationships, with the caveat that the only relationship is "is". Code written this way, relating things by identity rather than propositionally, is actually characteristic of functional style:

    mirror [] r = r           -- The mirror of [] prepended to r is r
    mirror (x:xs) y =         -- The mirror of (x:xs) prepended to y is
        mirror xs (x:y)       --   the mirror of xs prepended to (x:y)


The R in the Prolog code is only connecting together propositions so that we can express this idea in a functional way. The original mirrors predicate is quite unlike this; expressing it in Haskell requires more than a mindless transliteration (however there are still hints of a structural similarity, if a bit elusive).
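
For readers who speak neither Prolog nor Haskell, the difference between the two definitions can be sketched in Python. This is my own transliteration, not code from either program: lists are modelled as cons cells (`None` for the empty list, `(head, tail)` otherwise) so that prepending is cheap, and the helper names are hypothetical. `mirror_by_append` follows the shape of the first `mirrors` and re-walks the growing result at every step, hence the quadratic time; `mirror_onto` follows the three-argument `mirror` and touches each element once.

```python
def from_list(items):
    """Build a cons list from a Python list."""
    cell = None
    for item in reversed(items):
        cell = (item, cell)
    return cell

def to_list(cell):
    """Flatten a cons list back into a Python list."""
    out = []
    while cell is not None:
        head, cell = cell
        out.append(head)
    return out

def append(xs, ys):
    """xs followed by ys; walks (and copies) all of xs."""
    if xs is None:
        return ys
    head, tail = xs
    return (head, append(tail, ys))

def mirror_by_append(xs):
    """The mirror of (head, tail) is the mirror of tail followed by head."""
    if xs is None:
        return None
    head, tail = xs
    return append(mirror_by_append(tail), (head, None))

def mirror_onto(xs, acc=None):
    """The mirror of xs prepended to acc."""
    if xs is None:
        return acc
    head, tail = xs
    return mirror_onto(tail, (head, acc))

print(to_list(mirror_by_append(from_list([1, 2, 3]))))   # [3, 2, 1]
print(to_list(mirror_onto(from_list([1, 2, 3]))))        # [3, 2, 1]
```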

But I claim that “is” is not the defining linguistic characteristic of functional programs. We could write a functional program with a single equals sign if we were so obfuscatedly-minded; the analogous claim is invalid for logical and imperative programs. The characteristic device of functional programming is the noun: functional programs do not give instructions or express relationships, they are interested in defining objects.

    bfs _ [] = []                              -- the BF traversal of nothing is nothing;
    bfs children xs =                          -- the BF traversal of xs is
        xs ++                                  --   xs itself appended to
        bfs children (concatMap children xs)   --   the BF traversal of each of xs's children
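
The same noun-style definition can be approximated in Python with a generator, which plays the role of Haskell's lazy list. This is a sketch, not a translation the post depends on; the `children` functions and the dictionary-based tree are assumptions made for the example.

```python
from itertools import islice

def bfs(children, xs):
    """The breadth-first traversal of the nodes xs: xs itself, followed by
    the breadth-first traversal of all of their children."""
    xs = list(xs)
    if not xs:                     # the traversal of nothing is nothing
        return
    yield from xs
    yield from bfs(children, [c for x in xs for c in children(x)])

# A small hypothetical tree, given as a mapping from node to children.
tree = {"a": ["b", "c"], "b": ["d"], "c": ["e", "f"], "d": [], "e": [], "f": []}
order = list(bfs(lambda n: tree[n], ["a"]))
print(order)    # ['a', 'b', 'c', 'd', 'e', 'f']

# Because the traversal is produced on demand, an infinite tree is fine
# as long as only a prefix is consumed.
first_seven = list(islice(bfs(lambda n: [2 * n, 2 * n + 1], [1]), 7))
print(first_seven)    # [1, 2, 3, 4, 5, 6, 7]
```

Note that nothing here says what to do with a queue; the function only describes what the traversal is.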


The monad is typically used to express instructions as nouns:

    main =                            -- the main program is the recipe which
        getLine >>= \name ->          -- gets a line then
        putStrLn $ "Hello, " ++ name  -- puts "Hello, " prepended to that line


Haskell elites will object to me using IO as a prototypical example of a monad, but the claim is still valid; look at the words used to define monad actions: tell, ask, get, put, call. These are imperative words. This is not a sweeping generalization, however; for example, the constructors of Escardó’s search monad are nouns.
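
The idea of an instruction treated as a noun can be sketched outside Haskell as well. The following Python is a toy model of my own, not how Haskell's IO is implemented: `GetLine`, `PutStrLn`, and `Bind` are hypothetical data types describing recipes, and `run` is a separate interpreter that carries a recipe out against in-memory lists instead of a real terminal.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class GetLine:
    """The recipe which gets a line."""

@dataclass(frozen=True)
class PutStrLn:
    """The recipe which puts a line of text."""
    text: str

@dataclass(frozen=True)
class Bind:
    """The recipe which performs `first`, then the recipe `then` builds from its result."""
    first: Any
    then: Callable[[Any], Any]

def run(recipe, inputs, outputs):
    """Carry out a recipe, reading lines from `inputs` and writing lines to `outputs`."""
    if isinstance(recipe, GetLine):
        return inputs.pop(0)
    if isinstance(recipe, PutStrLn):
        outputs.append(recipe.text)
        return None
    if isinstance(recipe, Bind):
        result = run(recipe.first, inputs, outputs)
        return run(recipe.then(result), inputs, outputs)
    raise TypeError(f"not a recipe: {recipe!r}")

# Defining `main` performs nothing; it only names a recipe.
main = Bind(GetLine(), lambda name: PutStrLn("Hello, " + name))

outputs = []
run(main, ["world"], outputs)
print(outputs)    # ['Hello, world']
```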

The following table summarizes the linguistic analogies:

| Paradigm | Mechanism | Example |
| --- | --- | --- |
| Imperative | Imperative mood | Put a piece of bread; put meat on top; put bread on top. |
| Logical | Propositional relationships | Bread sandwiches meat. |
| Functional | Noun phrases | Meat sandwiched by bread |

Each of these paradigms is pretty good in its area. When we’re issuing commands in a functional language, we pretend we are using an imperative language; when we want to treat complex nouns (such as lists) in an imperative language, we are learning to fall back on functional concepts to operate on them. When we are documenting our code or talking about types, we use logical ideas (typeclasses, generics).

The thing that makes me hubristically curious is the realization that these three categories hardly cover the mechanisms available in language. What else can we learn from human languages when expressing ourselves to computers?

This post has focused on the “content code” of the various paradigms. There is another kind of code (mostly in statically typed languages) that is interspersed with this, namely declarations:

    class Player {...}
    data Player = ...


I have not yet been able to find a good analogy for declarations in language; perhaps they introduce something like a proper noun. I will save this topic for another post.

# Relativism and Language

It is hard for me to imagine that so many people are so wrong. Sure, core beliefs go unexamined. Yes, we often unconsciously repeat taglines we have heard from those we respect instead of attempting to translate our true views. But I must admit I think of all people as essentially wanting to figure it out. Life, the universe, their meaning, how to make the world a better place. Some, who see the world as a competitive, dog-eat-dog place, want to figure it out because it will help them survive. Others, like me, who see the modern (Western — that is all I have direct experience with) world as an essentially benign place, just want to figure it out because of an innate curiosity (no doubt a result of past generations with the former motivation).

So when someone says something which strikes me as wrong, when I have the knee-jerk impulse to correct them, this belief of mine kicks in and stops me. Oh my, it didn’t use to; I would happily correct the abundant wrongness in the world. After all, if people think the right way, they will do better for themselves and others. I can’t remember a time when I didn’t have this belief, but it has taken a while to trickle its way into my choice of actions.

All through my youth, I was told that I was smart (a pedagogically questionable practice). I didn’t buy it (I’ve always had a rebellious streak). What makes me so special? I wasn’t just born with smartness, I thought. At first this manifested as an individualistic self-righteousness: I must be smart because of the intelligent ways I chose to spend my youth (what? Trampoline, video games, and Power Rangers?). More recently it has manifested as a skepticism of the views of those who tell me I am smart: you only say that because I am articulating things you agree with, so the compliment is a way of affirming your own worldview. Those both seem naive to me now. I don’t know what I currently think about it; I will probably only be able to articulate that once I move to some other view.

I am still skeptical of any innate superiority (however not enough so to avoid writing this post in a way that comes across as advice). So when I stop myself from correcting a wrongness, what do I do? This is the relativism I’ve been talking about.

Words don’t have meaning; in a conversation, the speaker translates meaning into words, and then the listener translates the words into meaning. We have a soft social agreement about how words are used, and that gives rise to trends in our patterns of thought. But the possibility remains — and I use the word possibility only because of a timidness, I really think of it more as a high probability — that the meanings that I have assigned to the words when I hear them are different from the meanings that were used to form them. Indeed, it is unclear what is even meant by two people having the same thought. My brain is not likely to have the ability to represent the thought that produced the words, especially if I disagree with them.

The exercise, then, is this: try to represent those thoughts anyway. How can I think of these words so that the sentence becomes true? Not just slightly less false, but really true. I might have to temporarily reorient my value system; I might have to imagine I grew up in a religious family; I might have to picture the scary possible worlds that might result if the statement were false (that is, beyond the proportion of the consequences I actually predict, already thinking the statement is false). When I remember to do this, I am brought to a calm, understanding level, with few fiery arguments in sight. My contributions to these conversations are transformed into questions instead of assertions — not Socratic “let me lead you to the right answer” questions, but genuine “I want to understand you” questions.

And that is the essence of relativism to me. What you mean by your words is not what I mean by your words. Sentences are uttered with the concept of their truth in mind, and before blasting forth a correction, I first have to understand how they are true. And more often than not, my planned correction is dismantled and replaced by a connected friendship.

# ∃

So many philosophical pseudo-debates focus on the existence or non-existence of this or that “thing”. Pop-skepticism is at odds with most sects of Christianity about the existence of a God; many skeptics, somehow oblivious of the hypocrisy in which they engage, argue simultaneously that claims must be supported with evidence and that there must be no God. I engaged fleetingly with the university’s skeptics society in a debate about the existence of the electron, in which I argued that the electron was a mathematical tool that enjoyed the same level of existence as a number, and not so much existence as…

Fortunately for me, I did not have to complete the above thought, as the debate was on Facebook so I was permitted not to respond once the level of abstraction exceeded me. Rather than the inevitable fate of a face-to-face debate on the subject — in which I would make a fool of myself for failing to possess a well-collected, self-consistent argument, my opponents permitting me to exit the arena now having failed to disrupt their conceptual status quo — the debate fizzled out, and they will probably not remember of their own volition that they had even engaged in it. It is all for the better that my medium has changed, since after some time spent meditating on the question, I have come across something I have not been able to distill into a snappy epigram.

To a “standard model” logical mind, and even to the working mathematician who has not studied logic, existence is a straightforward concept. One can ask whether a mathematical object exists with some property, and assume without argument that one is asking a reasonable question with a yes-or-no answer. However, in the world of mathematical logic — the only logical world whose paradoxes I can comfortably resolve — the notion of existence is rather more slippery. There are the standard objects which one can prove to exist from the axioms, and there are — or perhaps I should say, there are not — objects whose existence is contradictory. But there is a neglected middle class. These objects _____ whether or not you choose to exclude the middle.

The Twin Prime Conjecture (TPC), a famous question still open today in 2011, conjectures that there are infinitely many numbers p such that both p and p+2 are prime. One of these pairs is called a “twin prime”, for example 5 and 7, or 179 and 181. There are many who believe TPC is true, some who believe TPC is false, but among logicians (who crave this sort of result), many believe TPC is “independent of the axioms.” Let us explore the consequences of this latter belief. To be concrete (insofar as such a word can mean anything in such matters), let us suppose that TPC is independent of “ZFC”, the Zermelo-Fraenkel axioms with the Axiom of Choice, the axioms of choice (no pun intended) for popular set theory.

It would be helpful to be reminded of what exactly ZFC is. Aside from the deep fantastic worlds of intuition inhabiting many mathematicians’ minds, it is merely a set of 9 statements about the world of sets. For example, “if two sets have the same members, then they are the same set”, and “given any set, you may form the subset of elements satisfying a particular property”. These are stated in rigorous, precise logical language, so by formal manipulation we can exclude the subtleties of meaning that would abound in any English presentation of these axioms. Logicians like to say that a proof is nothing more than a chain of formal logical sentences arranged according to some simple rules; this view has spread since the advent of programming languages and computerized mathematical assistants.

If TPC were true, then given any number, you could count up from that number and eventually reach a twin prime. If TPC were false, then there would be some number, call it L, above which it would not be possible to find any twin primes. However, since TPC is independent (because we have supposed it), then we know we cannot prove it either way. It may be true, or it may be false; whether there is a third option is too deep a philosophical question to explore here. We may be able to count up from any number and find a twin prime, but we will never be sure that we will not arrive at a point after which there are no more. Or there may in fact be an L above which there are no more, but we shall never be able to write L as a sequence of digits. Again, whether these two comprise all possibilities is not a matter capable of absolute resolution.
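
The "count up and eventually reach a twin prime" procedure is simple enough to write down. The sketch below is illustrative only: `is_prime` uses plain trial division, and `next_twin_prime` is a hypothetical name for the search described above.

```python
from itertools import count

def is_prime(n):
    """Trial division; adequate for the modest numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_twin_prime(start):
    """Count up from `start` until some p with p and p + 2 both prime is found.

    If there were an L above which no twin primes exist, and `start`
    exceeded it, this loop would simply never return.
    """
    for p in count(start):
        if is_prime(p) and is_prime(p + 2):
            return (p, p + 2)

print(next_twin_prime(4))      # (5, 7)
print(next_twin_prime(170))    # (179, 181)
```

The asymmetry is the point: a successful search confirms a twin prime above `start`, but no amount of waiting on an unfinished search distinguishes "there are none" from "not found yet".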

There can be no proof that L exists, so, like God to the skeptics, it must not exist. By their own standard, this conclusion is not justified, for, by our assumption, there is no evidence in favor of its nonexistence either. Indeed, we may safely believe in L; if a contradiction would arise from its use, then we could leverage that contradiction to provide a proof that there are infinitely many twin primes, thus TPC would have been provable. After centuries of cautious hypothesis of what would happen if L did exist, we may begin to treat L as any other number. As the ancient Greeks’ unease about the existence of irrational numbers has faded, so too would ours. The naturals would become: 1, 2, 3, 4, 5, … L, L+1, …. We will have answered questions about L, for example it is greater than one million, because we have found twin primes greater than one million.

This all happens consistently with the proof that the set of natural numbers is made up of only the numbers 1, 2, 3, 4, 5, …, for that proof does not mean what we think it means. We cannot enumerate all the natural numbers in a theorem; that proof only states that the set of natural numbers is the smallest set made up of zero and successors of elements in that set. If we can actually find a twin prime above any number, but merely not know it, then we might claim L cannot be the successor of any element in this set. But this claim is false, because L is clearly the successor of L-1! L, whether or not or ___ it is one of the familiar numbers, manages to sneak its way into the smallest set containing zero and successors. It is not the set of numbers, but the language about numbers that can be extended by this independence of TPC, and L is not logically distinguishable from “regular” numbers. It is a symbolic phenomenon. But so, too, are the familiar numbers. The only difference is we have chosen to say that zero exists.
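
One common way to write "the smallest set made up of zero and successors" formally is as the intersection of every set that contains zero and is closed under successor. The notation below is my own gloss and skips the set-theoretic fine print (in ZFC the intersection is taken inside a set supplied by the Axiom of Infinity):

$$
\mathbb{N} \;=\; \bigcap \,\bigl\{\, S \;:\; 0 \in S \ \text{ and }\ \forall n\, (n \in S \Rightarrow n + 1 \in S) \,\bigr\}
$$

Nothing in this characterization lists the members one by one; it only names the closure property they share.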

# I accidentally the verbs

The deep structure of language. Actions, volition; identity, unification. Not realities, but means of understanding the world. Nonsense of the truth of a statement from a determinist perspective. Agent action: volition. Agent action because: post-justification. This is that: the identity fallacy. Two different words; why not this is this or that is that?

A series of images related only implicitly. Richness of language, constraints of mathematics. No verbs, no propositional meaning. Unpresentable relationships for human meaning-makers. An intermittent failure of inserting verbs above. A question of the meaning of verbless sentences. VP-less sentences, to be precise (but still without propositional meaning).

Future eschewing of translations into propositions. A case for the mind to let go of the propositional obsession. The blurring associations between thoughts: associations, judgements, perception blocks. An underappreciated reverence of thoughts in their own right. An abstract picture, not a fact. A thought flying free in the mind. A thought bound to realism with a fact. Meditation and spaciousness in the mind. The Sapir-Whorf hypothesis.

Nouns supplanting verbs as the foundational structure. NPs, to be precise. Potentially mismatching impedance between thoughts and nouns. The question of a freer correspondence between language and thought. Some thoughts have verbs. Mindful restriction, experimentation, like E-prime, to open new pathways in the mind. Now, thoughts of neither noun-ness nor verb-ness:

Vacuous.

Apparently not.

Graspingly anxious at the limits of an author’s imagination. Quotelike, fragmentlike. Descriptive, but without referent. As colors are to a blind man. Thought-Fourier-transform of. Just above descent into syntactic nonsense. Or just before thoughts of pure relationship.

of a normal verb-like sentence with

to the order of

beyond

That brings us back to nouns. A verbal thought.