**Quartiles** are used to summarize a group of numbers. Instead of looking at a big list of numbers (way too unwieldy!), you are looking at just a *few* numbers that give you a picture of what’s going on in the big list. Quartiles are great for reporting on a set of data and for making box and whisker plots. Quartiles are especially useful when you’re working with data that isn’t symmetrically distributed, or a data set that has outliers.

All numerical summaries—like mean, median, and mode—give you a few numbers to summarize a large group of data, but what’s special about quartiles is that they split the data up into **four equal-size groups**.

Quartiles are special cases of “percentiles.” A percentile is the number that is higher than a certain percent of the rest of the data set. For example, the “90th percentile” means the number that is higher than 90% of the other numbers in the group.

(By the way, some scientific studies use quintiles instead of quartiles—as you might guess from the name, quintiles divide the data set into *five* equal-size groups, instead of four groups like quartiles do.)

## Definition of Quartiles

There are five numbers that make up the “quartiles,” although some of the five numbers have more common names. Quartiles are the five numbers you need to split a group of numbers into four equal-size groups. Here they are, from lowest to highest:

- Minimum, or (rarely) “0th percentile”—the smallest number in the group.
- 1st quartile, Q1, or 25th percentile—the number that separates the lowest 25% of the group from the highest 75% of the group.
- Median, or 50th percentile—the number in the middle of the group, when arranged from smallest to largest.
- 3rd quartile, Q3, or 75th percentile—the number that separates the lowest 75% of the group from the highest 25% of the group.
- Maximum, or (rarely) “100th percentile”—the largest number in the group.

Notice the pattern in the percentile numbers: each successive quartile marks another 25% (one-quarter) of the data set.

Finally, one other related statistic is the **interquartile range**, or IQR: it’s the distance between the first quartile and the third quartile, so it measures how wide the central 50% of the data is. The IQR is useful in identifying **outliers**. Any data value that is more than 1.5 times the IQR below the first quartile or above the third quartile is called an outlier.

So, to check for outliers, you need to do two calculations:

- Lower boundary is Q1 – 1.5 * IQR
- Upper boundary is Q3 + 1.5 * IQR

Any value that is *lower* than the lower boundary or *higher* than the upper boundary is an outlier.
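Here is a minimal Python sketch of that outlier check. The function names `outlier_bounds` and `find_outliers` and the sample values (Q1 = 10, Q3 = 20) are made up for illustration; they are not from any particular library or data set.

```python
def outlier_bounds(q1, q3):
    """Return the (lower, upper) boundaries from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall outside the two boundaries."""
    lower, upper = outlier_bounds(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Hypothetical quartiles and data, chosen only to show the arithmetic.
print(outlier_bounds(10, 20))                      # (-5.0, 35.0)
print(find_outliers([-7, 3, 12, 18, 36], 10, 20))  # [-7, 36]
```

With an IQR of 10, the boundaries sit 15 units outside the quartiles, so only -7 and 36 are flagged.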

## Example of Using Quartiles

Let’s work with a small list of numbers first. Here are the heights of eight NBA point guards (in inches):

72, 72, 73, 73, 74, 75, 76, 79

With a small group of numbers, find the median by seeing which data value is in the middle. Since there are an even number of data values in this list, the median is halfway between the middle two values (73 and 74), so the median is 73.5. (For a more in-depth look at calculating the median and other numerical summaries, as well as lots of other statistics topics, take a look at our statistics video lessons.)

A quick way to estimate the 1st quartile is to look at only the data values which are *less than* the median:

72, 72, 73, 73

and find the median of that lower half. In this case, it is 72.5. The fact that 72.5 is the 1st quartile tells us that a quarter of the data values are less than 72.5, and the rest of them are higher than 72.5. In terms of basketball players, 25% of this group of basketball players is shorter than 72.5 inches, and 75% of the group is taller than 72.5 inches.

Similarly, to see the 3rd quartile, look at only the data values that are *greater than* the median:

74, 75, 76, 79

and the 3rd quartile will be the median of this half of the list. So the 3rd quartile is 75.5.

To report all the quartiles together, we list the five number summary (minimum, 1st quartile, median, 3rd quartile, and maximum):

72, 72.5, 73.5, 75.5, 79

From this list of five numbers, we learn a lot! We know the heights of the shortest and tallest people in the group, as well as the median (the halfway point). We *also* can see where the middle 50% of the basketball players are—the central 50% of players are between 72.5 inches (1st quartile) and 75.5 inches (3rd quartile).

Finally, the interquartile range (IQR) is 3, telling us that the middle 50% of the group are separated by only three inches.
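If you want to check this arithmetic, the median-of-halves method can be sketched in a few lines of Python. The function name `five_number_summary` is just an illustrative choice. For a list with an odd number of values, this sketch leaves the middle value out of both halves, which is one common convention and an assumption on our part, since the example here only covers an even-sized list.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Expects at least two values. For an odd count, the middle value is
    left out of both halves (an assumed convention).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return min(data), median(lower), median(data), median(upper), max(data)


heights = [72, 72, 73, 73, 74, 75, 76, 79]
minimum, q1, med, q3, maximum = five_number_summary(heights)
print(minimum, q1, med, q3, maximum)  # 72 72.5 73.5 75.5 79
print(q3 - q1)                        # 3.0 (the IQR)
```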

## Why do quartiles matter?

Quartiles let us quickly divide a set of data into four groups, making it easy to see which of the four groups a particular data point is in.

For example, a professor has graded an exam from 0-100 points. Say that professor wants to give bonus points to the top 25% of students, remedial instruction to the bottom 25% of students, and a chance for extra credit to the middle 50% of students. If the professor knows the quartiles are 55, 62, 75, 88, and 95, then it makes it easier to see where the dividing lines are. Got a 73? You’re in the middle 50%. Got an 89? Congrats, you’re in the top 25% of the class!
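As a rough sketch, the professor's sorting rule could look like this in Python. The function name `exam_group` is hypothetical, and treating a score that lands exactly on Q1 or Q3 as part of the middle 50% is an assumption, since the example doesn't say how ties are handled.

```python
def exam_group(score, q1, q3):
    """Place a score in the bottom 25%, middle 50%, or top 25%.

    Scores exactly equal to q1 or q3 count as "middle 50%" here
    (an assumed tie-breaking rule).
    """
    if score < q1:
        return "bottom 25%"
    if score > q3:
        return "top 25%"
    return "middle 50%"


# Quartiles from the example: 55, 62, 75, 88, 95 (so Q1 = 62 and Q3 = 88).
print(exam_group(73, q1=62, q3=88))  # middle 50%
print(exam_group(89, q1=62, q3=88))  # top 25%
```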

The middle 50% of the data can be useful to know about, especially if the data set has outliers. If the minimum or maximum values are far away from the central 50%, then there are probably some outliers in the data set.

## Technical Stuff: Locating Quartiles

The method above (median of the top half, median of the bottom half) is fine for small sets of data, but for larger sets, we have a different way to find the quartiles.

It all starts with N, the number of values in the data set.

To find the first quartile, or 25th percentile, multiply 0.25 by N, and then round up (or, if you happen to get a whole number, see the *Note* below). This gives you the location or “index” of the first quartile. It’s like an address that you can then use to go find the actual value of the first quartile.

So in a list of 110 test scores, the first quartile has an index of 0.25 * 110 = 27.5, which we round up to 28. That means that the first quartile is located in the 28th spot. Arrange the data values from smallest to largest, and count out to the 28th number in the list. Whatever number is there is the first quartile.

*Note*: if you got a whole number when multiplying by N, add 0.5 instead of rounding up. For example, in a list of 700 numbers, the third quartile has an index of 0.75 * 700 = 525, which we add 0.5 to, and get 525.5.

An index with a 0.5 ending means we need to take the average of two numbers: the ones on either side of the index. If the index is 525.5, we need to find the numbers in the 525th spot and the 526th spot, and average them together.
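The index rule above can be sketched in Python like this. The names `quartile_index` and `quartile_value` are our own, not from a library, and the sketch uses whole-number arithmetic (k * N / 4 for the k-th quartile) instead of multiplying by 0.25 or 0.75, which gives the same index. Keep in mind that statistics packages use several slightly different quartile conventions, so their answers may not always match this rule.

```python
import math


def quartile_index(k, n):
    """1-based location of the k-th quartile (k = 1, 2, or 3) among n values.

    A result ending in .5 means "average the two neighboring values".
    """
    numerator = k * n
    if numerator % 4 == 0:          # k * n / 4 is a whole number: add 0.5
        return numerator // 4 + 0.5
    return math.ceil(numerator / 4)  # otherwise round up


def quartile_value(sorted_values, k):
    """Look up the k-th quartile in a list already sorted smallest to largest."""
    index = quartile_index(k, len(sorted_values))
    if index % 1 == 0.5:
        spot = int(index)            # e.g. 525.5 -> spots 525 and 526
        return (sorted_values[spot - 1] + sorted_values[spot]) / 2
    return sorted_values[index - 1]  # convert the 1-based spot to a 0-based index


print(quartile_index(1, 110))  # 28
print(quartile_index(3, 700))  # 525.5

heights = [72, 72, 73, 73, 74, 75, 76, 79]
print(quartile_value(heights, 1), quartile_value(heights, 3))  # 72.5 75.5
```

For the eight point guard heights, this rule happens to give the same quartiles as the median-of-halves method.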

## Quartiles Summary

Some statistics only tell us about the center of the data, or a typical value. Other statistics, like range and standard deviation, tell us something about the spread of the data values. Quartiles do both!

How? Well, the median tells us the center of the data set, while the first and third quartiles tell us about how spread out the middle 50% of the data set is. Finally, the minimum and maximum values tell us about the most extreme values in the data set.
