# What is Empirical Probability?

In probability theory, empirical probability is an estimated probability based on previous evidence or experimental results. As such, empirical probability is sometimes referred to as experimental probability, and we can distinguish it from probabilities calculated from a clearly defined sample space.

Let’s first compare and contrast empirical probability and theoretical probability. Then we’ll look at an example problem that relates empirical probability to the important concept of expected value.

### Empirical probability: A definition and example

Empirical probabilities are based on how often an event has occurred in the past. Thus, they are always estimates.

A great and common example of an empirical probability is a player’s batting average in baseball. For example, according to ESPN.com at the time of this writing, Philadelphia Phillies power hitter Ryan Howard has a career batting average of .258. This is found by computing the following ratio:

Batting average = # of hits / # of at-bats

Batting averages are a common example of an empirical probability. Photo by Dan Gaken.

We can use the batting average as an empirical probability if we want to estimate how likely it is that Ryan Howard gets a hit on his next at-bat. Using his batting average as an empirical probability, the likelihood that Howard gets a hit is 0.258, or 25.8%. That’s because, if we look at all the at-bats Howard has taken thus far in his career, he’s gotten a hit 25.8% of the time, or about one in every four at-bats.
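As a minimal sketch of the ratio above, the snippet below computes an empirical probability as a relative frequency. The function name `empirical_probability` and the hit and at-bat counts are hypothetical, chosen only so that the ratio comes out to .258; they are not Howard’s actual career totals.

```python
def empirical_probability(successes: int, trials: int) -> float:
    """Return the observed relative frequency: successes / trials."""
    if trials <= 0:
        raise ValueError("need at least one observed trial")
    return successes / trials


# Hypothetical counts, picked only to reproduce a .258 average.
hits, at_bats = 129, 500

batting_average = empirical_probability(hits, at_bats)
print(f"{batting_average:.3f}")  # prints 0.258
```

The same function works for any event we can count: pass in how many times the event occurred and how many times it could have occurred.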

(Does this mean that if Howard has struck out in his past three at-bats, he’ll surely get a hit his next time at the plate? Of course not. For one, empirical probabilities are simply estimates based on past observation. Also, it’s reasonable to treat consecutive at-bats as independent of one another, so earlier outcomes don’t change the chance of a hit in the next one.)
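To see what independence implies here, we can run a rough simulation. This is only a sketch under stated assumptions: every at-bat is modeled as an independent trial with a fixed hit probability of 0.258, any at-bat without a hit counts as an out, and the function name `hit_rate_after_three_outs` is made up for this illustration.

```python
import random


def hit_rate_after_three_outs(p_hit: float, n_at_bats: int, seed: int = 0) -> float:
    """Simulate independent at-bats and return the hit frequency
    observed immediately after three consecutive hitless at-bats."""
    rng = random.Random(seed)
    # True means a hit, False means no hit (assumed independent trials).
    results = [rng.random() < p_hit for _ in range(n_at_bats)]
    followups = [
        results[i]
        for i in range(3, n_at_bats)
        if not (results[i - 1] or results[i - 2] or results[i - 3])
    ]
    return sum(followups) / len(followups)


rate = hit_rate_after_three_outs(p_hit=0.258, n_at_bats=200_000)
print(f"{rate:.3f}")
```

Under these assumptions the printed frequency should land close to 0.258: a hitless streak does not make the next hit any more likely.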

### Distinguishing empirical probability from calculated probability

In contrast to empirical probabilities, which are estimates, probabilities calculated from a sample space of distinct, equally likely outcomes are exact. For example, if the event X consists of m desired outcomes within a total sample space of n equally likely outcomes, then the probability of X will be:

P(X) = m/n

To illustrate this concept, let’s consider the probability of drawing an ace from a standard, shuffled deck of fifty-two cards. In this case, there are four aces in the deck, so m = 4. Since there are fifty-two total cards in the deck, n = 52. Therefore,

P(ace) = 4/52

Note that we can compute this probability without actually drawing any cards; unlike with empirical probabilities, it is unnecessary to observe and experiment beforehand. Also, this probability is exactly 4/52. It’s not an estimate. Both of these facts distinguish probabilities calculated from sample spaces from empirical probabilities, which are always estimates based on past data.
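The contrast can be sketched in code. Below, the exact probability comes from counting outcomes in the sample space, while the empirical estimate comes from simulated draws. The helper `estimate_ace_probability` and the choice to draw one card at a time from a full deck are assumptions made for this illustration.

```python
import random
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in RANKS for suit in SUITS]

# Calculated probability: m desired outcomes out of n equally likely ones.
m = sum(1 for rank, _ in deck if rank == "A")
n = len(deck)
theoretical = Fraction(m, n)


def estimate_ace_probability(n_draws: int, seed: int = 0) -> float:
    """Empirical estimate: draw one card from a full deck n_draws times
    and return the fraction of draws that were aces."""
    rng = random.Random(seed)
    aces = sum(1 for _ in range(n_draws) if rng.choice(deck)[0] == "A")
    return aces / n_draws


print(theoretical)  # prints 1/13, the reduced form of 4/52
print(estimate_ace_probability(100_000))
```

The first value never changes, while the second depends on the draws observed and only approaches 4/52 as the number of draws grows.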

As opposed to an empirical probability, which is an estimate based on past data, the probability of drawing an ace can be calculated exactly. Photo by Cukierek.

### Practice problem: Empirical probability and expected value

Here’s an example problem to illustrate how calculating an empirical probability can be used to make predictions. Let’s assume a fictitious start-up company, Empiricus Enterprises, is increasing production of its three products: Product X, Product Y, and Product Z. In the table below, we have the number of units sold of each product in 2017. In 2018, the company plans to increase its production, and it estimates that it can manufacture 7,000 total units of X, Y, and Z. Now, of those 7,000 units, how many should be of Product Y?

Let’s start by computing the empirical probability that a random customer, when faced with buying X, Y, or Z, will choose Y. From last year’s data, the proportion of units sold that were product Y was:

350/(250+350+400) = 350/1000 = 35%

Now we can use this empirical probability to make a prediction, or find the expected value, of the number of units of Product Y that will likely be sold out of next year’s 7,000 units:

35% * 7,000 = 2,450

Therefore, we estimate that about 2,450 units of Product Y will sell, based on last year’s data. Again, this is only an estimate, and the actual number of units sold will likely vary somewhat.
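The two steps of this calculation can be sketched as a short function. The name `expected_units` is hypothetical, and `Fraction` is used here only to keep the arithmetic exact.

```python
from fractions import Fraction


def expected_units(units_sold: int, total_units_sold: int, planned_units: int) -> Fraction:
    """Scale last year's observed share of sales to next year's planned volume."""
    empirical_probability = Fraction(units_sold, total_units_sold)
    return empirical_probability * planned_units


# 350 of the 1,000 units sold in 2017 were Product Y; 7,000 units are planned.
estimate = expected_units(350, 1000, 7000)
print(estimate)  # prints 2450
```

Swapping in the sales count for Product X or Product Z gives the corresponding estimate for that product.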

For a more rigorous look at the mathematics of expected value, check out this PDF from Dartmouth College. Or, if you need more help with the fundamentals of empirical probability, then check out our statistics videos and blog posts here!