Statisticians point out that it’s often useful to “chunk” data to understand it. What does it mean to “chunk” data? It means dividing a long list into smaller chunks so that, with a few well-chosen numbers, we can get a sense of the layout of the list.

The fundamental “chunking” number is the median. The median is the middle of the list: that is, it divides the list into two chunks: an upper list and a lower list. This one number, the median, tells you both the maximum of the lower list and the minimum of the upper list.

## Quartiles

Quartiles extend this idea. First, find the median, which divides the entire list into a “top 50%” list and a “bottom 50%” list. Now, find the median of each of these lists. The median of the “bottom 50%” is called Q1, the **first quartile**. The median of the “top 50%” is called Q3, the **third quartile**. They are called “quartiles” because the two quartiles and the median divide the list into four equal chunks:

- the lowest 25% of the list is below the first quartile
- the next 25% of the list is between the first quartile and the median
- the next 25% of the list is between the median and third quartile
- the highest 25% is above the third quartile.

Notice that we don’t use the term “second quartile,” because the median plays the role of the second quartile.
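The median-of-halves procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `quartiles` and the sample list are made up for this example, and it assumes one particular convention (when the list has an odd number of entries, the middle entry is left out of both halves). Other textbooks and software packages use slightly different conventions and can give slightly different quartiles.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    Assumed convention: with an odd number of entries, the middle
    entry is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # the "bottom 50%" list
    upper = data[(n + 1) // 2 :]  # the "top 50%" list
    return median(lower), median(data), median(upper)

# Hypothetical sample list with eight entries.
q1, med, q3 = quartiles([2, 4, 4, 5, 7, 8, 10, 12])
print(q1, med, q3)  # 4.0 6.0 9.0
```

With eight entries, each of the four chunks holds two numbers: two below Q1, two between Q1 and the median, and so on.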

## The Interquartile Range

Often, statisticians are bothered by outliers, that is, extreme high or low values. An outlier is a member of the list that is not representative of most of the list. In the list of household incomes in the US, the incomes of Bill Gates and Warren Buffett are not representative of the rest of us: they are outliers. Outliers, by definition, will always be at the very top or the very bottom of a list.

Notice that any outliers will necessarily fall in either the “top 50%” or the “bottom 50%.” Would it be possible to talk about a “half” of the population that definitely contains no outliers? Well, instead of the “top 50%” or the “bottom 50%,” we could take the “**middle 50%**.” What’s that? Well, suppose we look at all the folks between the first quartile and the third quartile. We know that a quarter of the population is between the first quartile and the median, and a quarter is between the median and the third quartile, so 50% of the population lies between the first quartile and the third quartile, and it’s the 50% in the middle of the population. This is called the **interquartile range**: the set of data entries from the first quartile to the third quartile. (The same term is also used for the width of that interval, Q3 minus Q1.) It’s a big deal because it’s not the upper half or lower half but rather the **middle half** of the data. For this reason, statisticians feel it gives a very good representation of where the typical data lie.
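To see how the middle 50% sidesteps an outlier, here is a small Python sketch. The income figures are invented for illustration (in thousands, with one deliberately extreme value), the helper name `middle_half` is hypothetical, and the quartiles are found with the median-of-halves convention that leaves out the middle entry of an odd-length list.

```python
from statistics import median

def middle_half(values):
    """Return (Q1, Q3, entries lying from Q1 to Q3)."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])        # median of the bottom half
    q3 = median(data[(n + 1) // 2 :])  # median of the top half
    inside = [x for x in data if q1 <= x <= q3]
    return q1, q3, inside

# Made-up incomes in thousands; 5000 plays the role of the outlier.
incomes = [18, 25, 31, 40, 47, 52, 60, 75, 5000]
q1, q3, inside = middle_half(incomes)
print(q1, q3, q3 - q1)  # 28.0 67.5 39.5
print(inside)           # [31, 40, 47, 52, 60]
```

The extreme value 5000 sits in the top quarter, so it has no effect on the middle half or on its width.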

## An Example with Real Data

Consider the geographic size of countries. On planet Earth, what is the size of a typical country? Well, if we list the countries and their areas, we find the maximum is Russia (16,995,800 sq km) and the minimum is the Holy See (0.44 sq km). Obviously, neither one of those is typical of the area of a country.

The median value on the list is 50,660 sq km (Costa Rica). So that’s interesting: half the countries on Earth have more area than Costa Rica, and half have less. Incidentally, the US State of West Virginia is slightly bigger than this, so little old West Virginia has more area than half the countries on Earth. Who would have thought that? 🙂

The third quartile is 325,360 sq km (Vietnam) and the first quartile is 572 sq km (the Isle of Man). So, even within the interquartile range, there’s huge variation from 572 up to 325,360. Still, we can say half the countries on Earth have more area than the Isle of Man but less area than Vietnam. That’s where the middle 50% lies. That would be, in many ways, the most representative range for the size of a “typical” country.

Hi Expert,

I see that you mentioned that “First, find the median, which divides the entire list into a “top 50%” list and a “bottom 50%.” Now, find the medians of each one of these lists. The median of the “bottom 50%” called Q1, the first quartile. The median of the “top 50%” is called the third quartile.” But for one of the practice problems in Magoosh, the same is explained the other way round. That is, it said the “bottom 50%” is the third quartile and the “top 50%” is the first quartile.

I’m confused now. Please help.

Hi Prad,

I see you’re a GRE Premium member. I’ll have one of our student help tutors email you to take a closer look at that problem. 🙂

Consider the geographic size of countries. On planet Earth, what is the size of a typical country? Well, if we list the countries and their areas, we find the maximum is Russia (16,995,800 sq km) and the minimum is the Holy See (0.44 sq km). Obviously, neither one of those is typical of the area of a country.

The median value on the list is 50660 sq km (Costa Rica). So that’s interesting: half the countries on Earth have more area than Costa Rica, and half have less. Incidentally, the US State of West Virginia is slightly bigger than this, so little old West Virginia has more area than half the countries on Earth. Who would have thought that?

PLEASE ANYONE EXPLAIN THIS QUESTION..

Hi Bhumica,

I’m happy to help you to understand this question, but can you please give me some more information about where you are struggling? If you can tell me a bit more about what you understand or don’t understand, I can provide more targeted help 🙂

Hi,sir

Your explanation is very useful & interesting, thanks a lot !

Dear Endong,

You are quite welcome, my friend. Best of luck to you!

Mike 🙂

QUARTILES concept was totally new for me,being a medical student….BUT it was best explanation I can ever find……

Dear Mehwish,

I’m glad you found this helpful. Best of luck to you in your studies.

Mike 🙂

Sir say a question to find M,Q1,Q3 for the following list of 13 numbers:

1,3,3,5,8,9,9,10,13,14,17,18,20

The mean is easy to spot as it divides the list into 2 equal halves; now how to calculate Q1 (or Q3)? Is it done by taking the average of the middle elements because the number is even?

Praneeth: First of all, it’s the *median* — not the mean — that is “easy to spot”. Here, the median is that second 9. When the median is a number on the list (typically, when there is an odd number of entries), remove that number and break the list into two “half lists” — an “upper list” and a “lower list”. Here, the lower list is {1,3,3,5,8,9} and the upper list is {10,13,14,17,18,20}. Q1 is the median of the lower list = 4. Q3 is the median of the upper list = 15.5. Notice, both of those lists had even numbers of entries, so we had to take the average of the two middle numbers to find the median.

Does all that make sense?

Mike 🙂
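The numbers in the reply above can be checked with a short Python sketch. It follows the same steps (remove the middle entry, then take the median of each half); the variable names are just illustrative.

```python
from statistics import median

data = [1, 3, 3, 5, 8, 9, 9, 10, 13, 14, 17, 18, 20]  # already sorted

mid = len(data) // 2     # index 6: the 7th of 13 entries, the second 9
lower = data[:mid]       # [1, 3, 3, 5, 8, 9]
upper = data[mid + 1:]   # [10, 13, 14, 17, 18, 20]

# Each half has six entries, so its median is the average of the two middle ones.
print(data[mid], median(lower), median(upper))  # 9 4.0 15.5
```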

I got it Thank You sir for your clarification

I’m glad you understand.

Mike 🙂

I believe the mean does not always divide a list into two halves; rather, it moves towards the higher frequency pull (like a weighted average). Please clarify if this is true.

Hello Harminder,

Yes, you are correct! The median is what divides a set of data into two halves, but the mean may be less than, more than or equal to the median. Consider this data set: 1,2,3,10,15. The median is 3 because it is right in the middle. To find the mean we need to add up all of the numbers and divide by the quantity of numbers. There are five numbers and they add up to 31, so we get 31/5, which is 6.2. The mean is much higher than the median (middle) of the set, which we would expect since the final two numbers (10 and 15) are much larger than the first three numbers. So, 10 and 15 ‘pull’ the mean away from the median.
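For anyone who wants to verify the arithmetic, here is a quick check in Python using the standard `statistics` module. The variable names are only illustrative.

```python
from statistics import mean, median

data = [1, 2, 3, 10, 15]

mean_value = mean(data)      # sum 31 divided by 5 entries
median_value = median(data)  # the middle (third) entry

print(mean_value, median_value)  # 6.2 3
```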

sir,

if the list has even no of quantities how do we find quartiles

Praneeth: I don’t understand the situation you are describing. The question always has to provide *some* kind of information. Please provide a specific question, and we will be happy to address that. You can email your question to gre@magoosh.com.

Mike 🙂

Sir consider a list of numbers:1 2 2 4 5 7 10 11 12 14 17 18 20. Mean can be deduced to be 10 as it divides the data into 2 equal halves now how do i calculate Q1(or Q3)? because the number of elements after splitting is even

Praneeth your questions are very interesting as they also help me understand this topic. However, you keep calling the middle number mean…it is the median not mean.

Hi

That was explained well, but I was asking about the 34% and the other percentages. How do we get those?

thanks

Dear seear:

Please see this post:

https://magoosh.com/gre/2012/normal-distribution-on-the-gre/

and let us know if you have any further questions.

Mike 🙂

Vivid explanation….thanks

You are quite welcome. Best of luck to you in your preparations, and please let us know if you have any questions.

Mike 🙂

No mention! I am glad you fixed it.

soumya — Great catch! You were spot-on correct! We fixed the mistake. Thank you very much! Mike 🙂

first sentence of last paragraph- Do you mean that isle of man is first quartile and vietnam is 3rd quartile?

The post was very much informative and cleared all my doubts on Interquartile range. Looking forward to similar kind of posts regarding Percentiles…