A group of college students bask in their crude weekly haven of Texas Hold’em poker. Amid the usual banter, they watch as one deals out the flop: a three of spades, a three of diamonds, and a three of clubs. The room falls silent and tensions are suspended as if the ground had fallen out from beneath the house. The highest card may win this hand, a pocket pair is almost golden, and everyone wonders if someone has the single remaining three. A round of timid betting ensues — a wrong step here could be very expensive. After the round completes, the fourth card is dealt onto the table: the three of hearts. The tensions are dropped as everyone laughs incredulously, and out of the laughter emerges “what’s the probability of *that*?”

One of the more mathematically adept players pulls out his phone and does a quick calculation: — about 1 in 270,000. Everyone is wowed by the rarity of the event they just witnessed.

That is indeed the correct probability to get four consecutive threes from a deck of cards. But is that really the question here? Surely nearly the same response would have occurred if it had been four fours, or four nines. If it were four aces people would be even *more* astonished. The same response would have occurred if the cards had been four to a straight flush; e.g. the five through eight of spades. There are many such situations. “Four threes” is the most immediate *pattern* that we recognize as anomalous, but it is not the only anomalous pattern.

So what event is really being referred to by *that*? Those specific four cards had a 1 in 6.5 million chance of coming up in the order they did from a player’s perspective before the hand, and a 100% chance of coming up in the order they did from a perspective after the hand [some will note that I am using a Bayesian interpretation of probability at this point]. The probability of the specific *real world* event that occurred (including the orientations and temperatures of the cards, the reactions of the players, and the taste of Jake Danser’s bite of Chinese dinner 500 miles away), from the perspective of any of the players, is unbelievably small.

In situations where this question is asked, I always jokingly answer the question with “1”. Most of the time people laugh and let the question rest, but sometimes the conversation turns more serious. In this case I try (in all cases so far, in vain) to explain the plurality of perspective at play here. The “serious” answer I have for this question, doing my best to interpret the intended meaning of the statement while incorporating these issues, is a number I cannot calculate but that I can describe: it is the probability that the speaker would have been astonished by the event that occurred; essentially the probability that he would remark “what’s the probability of *that*?”. I think that is quite a bit higher than one in 270,000, so the event we witnessed was not as rare as the simple calculation would have us believe.

The dissonance of such situations points to a common simplistic view of probability: that *events* (in the colloquial sense) have probabilities. A distinction that is commonly understood (and commonly misunderstood) is that between **frequentism**, which talks about running an experiment a bunch of times and calculating the proportion of experiments that satisfied some criterion, and **Bayesianism**, which talks about the confidence one has in some property being true *in terms of a subjective state of knowledge*. This is a fascinatingly subtle distinction: they coincide on most but not all questions, and there is much argument about which one is “right” (I think that is a rather silly argument, as if a probability were something bestowed to us from nature). However, both of these views are calculating the probability of a *property* (a *set of events*) being true (happening), and both of these views rely on an assumption of prior information: the frequentists on the set-up of the experiment and the Bayesians on their prior state of knowledge about the world. The idea that *an event* has a single, objective probability says something very strong about the universe: that there is some essential, natural source of randomness (to dispel any quantum objections, I point to E.T. Jaynes’s critique of the view that the randomness in quantum theory is an essential property of nature). But even if there were some essential source of randomness, surely the universe is more determined than our probabilistic calculations assume: from the view of this universe, the deck cannot be uniformly shuffled after only seven shuffles because it knows how your hands and your brain work. So we were never talking about an essential, natural randomness.

In our normal use of probabilities, we don’t run into this problem because we are predicting the future: e.g. “what is the probability that Obama will be re-elected?”. We conceive of this as a single event, but it is actually a wide collection of events. Given some prior knowledge, this set of events has a definite probability. But we are inconsistent about how we treat future events and past events: past events are not collections — they happened, and they happened in exactly one way. From the perspective of now, the probability that it would happen any other way is zero. From the perspective of before the event, we are asking whether “it” would happen or not, but we are entirely unclear about what measurable property we mean by “it”. We mean something different when we refer to events in the future and events in the past.

In summary, probabilities are functions of (1) prior information and (2) a *criterion* for judging when it is to be considered true. Probability is meaningless to apply to a *single* event.

N.B. in probability theory, the word *event* is already defined to mean a set of outcomes. If you read this article with that definition in mind, you will have been very confused :-).

If this post helped you think more clearly about probability, show your appreciation and .