The **normal distribution**, also known as the Gaussian distribution, is more familiarly known as the **bell curve**.

As you can see from the picture, the normal distribution is dense in the middle, and tapers out in both tails. If you’ve ever had a teacher or professor “curve” the exam grades in a class, this means fitting the exam scores to the bell curve, that is, the normal distribution. When fitting to the bell curve, the grades are centered around the **mean** score (the tallest, central point on the curve, typically signified by the Greek letter μ), which becomes the equivalent of a C grade; the rest of the scores then fall somewhere around this central mean.

Because of the shape of the distribution, the bulk of the exam scores will be found in the fatter, middle portion of the curve. Thus, when fitting to a curve, the majority of students in the class will end up scoring a B, C, or D; because of the thinner tails, fewer students will be found further out from the mean, so A grades and failing grades will occur less frequently than middle-of-the-road scores.

This is the hallmark of the normal distribution–it is a distribution where the middle, the average, the mediocre, is the most common, and where extremes show up much more rarely. Because so many random variables in nature follow such a pattern, the normal distribution is extremely useful in inferential statistics. Let’s consider an example.

## Height: A normally distributed variable

To reiterate, a normal distribution can describe variables where values near the mean predominate, and extreme values are rare. Let’s take the heights of American women as an example. According to Columbia University Statistics, the average height for a woman in the US is 63.1 inches (or about 5 feet 3 inches), with a standard deviation of 2.7 inches.

Because height, like so many variables found in nature, is approximately normally distributed, we can reasonably expect that most American women we encounter will have a height much closer to 5 feet 3 inches than to, say, 7 feet. For height, like all normally distributed variables, the mean predominates, and extreme values are rare. Experience confirms this: We know that giants and dwarves are much less commonly encountered than those of average or near-average height. But just how *much* more common is the middle ground than the extremes? Let’s explore this question in a bit more depth.

## Probability distributions: a quick review

Recall that **probability distributions** are visual plots of how frequently certain values occur. In the past, you may have seen **discrete probability distributions**, which are displayed as **histograms**. The following is a *discrete* probability distribution showing the probabilities of every possible roll (from 2 to 12) of two standard 6-sided dice.

Looking at the above distribution, we can see that the probability of rolling a 7 (tallest, middle bar) is 1 in 6 (right-hand axis). We call this distribution *discrete* because *only certain values are possible*. For example, you can roll a 3 or a 4, but it is impossible to roll a 3.5.
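To make the dice example concrete, here is a minimal sketch that rebuilds the same discrete distribution by listing every outcome of two fair six-sided dice. The variable names (`counts`, `dice_probs`) are illustrative and not from the original post.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice
# and count how many of them produce each total.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Convert the counts into exact probabilities (count / 36).
dice_probs = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in dice_probs.items():
    print(f"{total:>2}: {p} ({float(p):.3f})")
```

Only the whole numbers 2 through 12 appear as keys, which is exactly what makes the distribution discrete, and the entry for 7 works out to 1/6.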

However, our previous variable of height is *continuous*, because heights *can* take any value. A woman might be 63.1 inches tall, but she might also be 63.2 inches tall, or 63.05 inches tall. There is no restriction on how fine our gradation can be; thus, the variable is *continuous*.

## The normal distribution is a continuous probability distribution function

Now we are ready to consider the normal distribution as a **continuous probability distribution function.** Unlike with discrete probability distributions, where we could find the probability of a single value, for a continuous distribution we can only find the probability of encountering a *range* of values.

For example, using the normal distribution, we *cannot* meaningfully answer the question, “What is the probability that a random woman in New York City is *exactly* 63.1 inches tall?” This is because the distribution is continuous and not discrete; any single exact value has a probability of essentially zero. However, we *can* answer the question, “What’s the probability that a random woman in New York City is *between* 60.4 and 65.8 inches tall?” (Answer: about 68%). This figure comes from the 68-95-99.7 rule.
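One way to check a range probability like this numerically is with Python's standard-library `statistics.NormalDist` (Python 3.8 or later). The sketch below assumes heights are exactly normal with the mean (63.1 inches) and standard deviation (2.7 inches) quoted earlier; the helper `prob_between` is a hypothetical name introduced only for illustration.

```python
from statistics import NormalDist

# Heights of US women, using the mean and standard deviation quoted earlier.
# Treating heights as exactly normal is an assumption of this sketch.
heights = NormalDist(mu=63.1, sigma=2.7)

def prob_between(dist, low, high):
    """P(low <= X <= high), computed as the difference of two CDF values."""
    return dist.cdf(high) - dist.cdf(low)

# One standard deviation on either side of the mean: 60.4 to 65.8 inches.
p_range = prob_between(heights, 60.4, 65.8)

# A single exact value is a range of zero width, so its probability is zero.
p_point = prob_between(heights, 63.1, 63.1)

print(f"P(60.4 <= height <= 65.8) = {p_range:.3f}")
print(f"P(height is exactly 63.1) = {p_point}")
```

The range probability comes out at roughly 0.68, while the "exactly 63.1" probability is zero, which is why only ranges are useful for a continuous variable.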

## The 68-95-99.7 Rule

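The 68-95-99.7 Rule says that for any normally distributed random variable,

- 68% of the **population** will lie within 1 standard deviation, 1σ, of the mean
- 95% of the **population** will lie within 2 standard deviations, 2σ, of the mean
- 99.7% of the **population** will lie within 3 standard deviations, 3σ, of the mean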

Let’s apply this to our height example. Earlier, we encountered the fact that the mean height of women in the US is 63.1 inches, and the standard deviation is 2.7 inches. According to the 68-95-99.7 rule, 68% of all women should have heights within one standard deviation, or 2.7 inches, of the mean. We can calculate this interval as follows:

63.1 − 2.7 = 60.4 and 63.1 + 2.7 = 65.8

Therefore, we expect about 68% of women in the US to have heights between 60.4 and 65.8 inches. The 68-95-99.7 rule is a useful, fast rule of thumb for determining probabilities under the curve. Note that the *entire* area under the curve equals 100%. This will always be the case for any probability distribution function, since the probabilities for all possible values must add to 100%. For more information on calculating more precise probabilities under the normal curve, check out this post on z-scores.
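The same interval arithmetic can be repeated for 2 and 3 standard deviations. The following is a rough sketch, again using `statistics.NormalDist` and assuming the height figures above describe an exactly normal variable; the dictionary name `rule` is illustrative.

```python
from statistics import NormalDist

mu, sigma = 63.1, 2.7          # mean and standard deviation from the text
heights = NormalDist(mu, sigma)

# For k = 1, 2, 3 standard deviations, build the interval mu ± k*sigma
# and compute the share of the population that falls inside it.
rule = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    rule[k] = heights.cdf(high) - heights.cdf(low)
    print(f"within {k} sd: {low:.1f} to {high:.1f} inches -> {rule[k]:.1%}")
```

The three shares land close to 68%, 95%, and 99.7%, which is where the rule gets its name; the exact values are slightly different, so the rule is best treated as a rounded rule of thumb.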

## Effect of variance on the normal distribution curve

So far, we’ve been talking about the normal curve as if it is a static thing. However, it might be more accurate to talk of normal *curves*, plural, as the curve can broaden or narrow, depending on the **variance** of the random variable. No matter the shape of the curve, however, three things will always be true: the curve is symmetric about the mean, the total area under it equals 100%, and the 68-95-99.7 rule applies.

Now that we know what is common to all normal curves, let’s explore what causes them to broaden or narrow. Generally, if a variable has a higher **variance** (that is, if a wider spread of values is possible), then the curve will be broader and shorter. However, if the variance is small (where most values occur very close to the mean), the curve will be narrow and tall in the middle. Check out the following graphic for a visual.
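A small numerical sketch can stand in for the picture. It compares three normal curves with the same mean but different standard deviations; 2.7 is the height figure from earlier, while 1.0 and 5.0 are hypothetical values chosen only to show the contrast. The peak height is read off with `NormalDist.pdf` at the mean.

```python
from statistics import NormalDist

mu = 63.1
# 2.7 comes from the height example; 1.0 and 5.0 are made-up comparison values.
sigmas = (1.0, 2.7, 5.0)

summary = []
for sigma in sigmas:
    dist = NormalDist(mu, sigma)
    peak = dist.pdf(mu)  # height of the curve at its center
    within_one_sd = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)
    summary.append((sigma, peak, within_one_sd))
    print(f"sigma={sigma}: peak height={peak:.4f}, "
          f"share within 1 sd={within_one_sd:.3f}")
```

As the standard deviation (and so the variance) grows, the peak gets lower and the curve spreads out, yet the share of values within one standard deviation of the mean stays at about 68% in every case.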

## Conclusion

The normal distribution, or bell curve, is broad and dense in the middle, with shallow, tapering tails. Often, a random variable that tends to clump around a central mean and exhibits few extreme values (such as heights and weights) is normally distributed. Because of the sheer number of variables in nature that exhibit normal behavior, the normal distribution is a commonly used distribution in inferential statistics.

For more information and practice with the normal distribution, check out our statistics videos and lessons!
