The post GRE Math: What’s the Difference Between Combination and Permutation? appeared first on Magoosh GRE Blog.

Do you know the difference between permutation and combination? No? You’re not alone. Combinations and permutations are the bane of many students. Yet what I’ve noticed over the years is that the real issue is not the two concepts themselves but knowing which one to use for a particular problem: the combination vs permutation question. In other words, students have no difficulty identifying whether a question is a combinations/permutations problem. The difficulty is in knowing exactly which one it is: combinations or permutations?

One way to think of it is to think of **permutations** as the number of arrangements or orderings within a fixed group. For example, if I have five students and I want to figure out how many ways they can sit in five chairs, I’m going to use the permutations formula. First off, the number in the group is fixed. Secondly, I’m looking for how many ways I can “arrange” the students in five chairs.

**Combinations**, on the other hand, are useful when figuring out how many groups I can form from a larger number of people. For instance, if I’m a basketball coach and I want to find out how many distinct teams I can form based on a group of people, I want to use combinations.

To make sure you understand this important distinction, here are three different scenarios. Your job is not to solve the question but to determine whether you use the combinations or the permutations formula to solve them.

1. Joan has five panels at home that she wants to paint. She has five different colored paints and intends to paint each panel a different color. How many different ways can she paint the five panels?

2. How many unique combinations of the word MAGOOSH can I form by scrambling the letters?

3. There are seven astronauts who are trying out to be part of a three-person in-space flight team. How many different flight teams can be formed?

1. Permutations

She wants to arrange the colors. The number of panels is fixed. Had she been choosing five panels from a total of 8, let’s say, then we would need to use combinations.

2. Permutations

Okay, this was a little bit of a trick, since I used the word “combinations”. But that word I used colloquially, not mathematically. In this case, the number of letters is fixed. We are simply rearranging them.

3. Combinations

We are choosing a group from a larger group. One way to think of it is that when you use the word choose in the context of selecting from a group, you are dealing with combinations. And “choose” and “combinations” both begin with the letter ‘C’. There is an exception to this rule that I’ll talk about in the next section.

Big Idea: If you are forming a group from a larger group and the placement within the smaller group is important, then you want to use permutations.

Imagine a group of 12 sprinters is competing for the gold medal. During the award ceremony, a gold medal, silver medal and bronze medal will be awarded. How many different ways can these three medals be handed out?

Remember that with permutations ordering is key. Even though the top three spots for the sprinters form a subgroup, it is the ordering within that subgroup that matters greatly, and is the difference between a gold, silver, and a bronze medal. An easy way to solve this question mathematically is to imagine that the dashes below are the podium upon which each sprinter will stand (albeit the dashes are at the same level):

____ ____ ____

gold silver bronze

To find out the number of different arrangements, ask yourself how many athletes can stand on the gold medal podium. Well, we have a total of 12 athletes. What about the silver medal podium? Now we have one athlete fewer, since one is already on the gold medal podium. So that gives us a total of 11 for the silver medal spot. Finally, that leaves us with 10 athletes for the bronze medal podium.

The math looks like this:

12 x 11 x 10 = 1320
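As a quick check of the podium count, here is a minimal Python sketch. It assumes Python 3.8 or later (for `math.perm`), and the sprinter labels are placeholders rather than anything given in the problem. It compares the slot-by-slot product, the permutations formula, and a brute-force listing.

```python
import math
from itertools import permutations

sprinters = range(12)  # placeholder labels for the 12 sprinters

# Fundamental counting principle: 12 choices for gold, 11 for silver, 10 for bronze.
by_slots = 12 * 11 * 10

# The same count from the permutations formula, 12P3 = 12! / (12 - 3)!.
by_formula = math.perm(12, 3)

# Brute force: count every ordered (gold, silver, bronze) triple.
by_listing = sum(1 for _ in permutations(sprinters, 3))

print(by_slots, by_formula, by_listing)
```

All three numbers agree, which is the point: filling ordered slots one at a time is exactly what the permutations formula counts.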

You might notice that this is the fundamental counting principle: the idea that when we are looking for the total number of outcomes, we multiply the numbers together (in this case, the number that goes on top of each dash). For instance, if I have six pairs of shorts and four t-shirts and I’m wondering how many different combinations of shorts and t-shirts I can wear, I want to multiply the two numbers, not add them:

4 x 6 = 24

(number of shirts) x (number of shorts)

I do not want to add them, which would give me 10, the wrong answer, in this case.

The reason I bring up the fundamental counting principle is that some questions will actually combine combinations with the fundamental counting principle (though you’ll likely use the fundamental counting principle more often with permutation questions). To give you an idea of how combinations can show up along with the fundamental counting principle, try the following question:

Mrs. Pearson has 4 boys and 5 girls in her class. She is to choose 2 boys and 2 girls to serve on her grading committee. If one girl and one boy leave before she can make a selection, then how many unique committees can result from the information above?

(A) 9

(B) 12

(C) 18

(D) 22

(E) 120

The first step in this problem is recognizing whether we are dealing with combinations or permutations. Since I am ‘choosing’ from a larger group, in this case two separate groups, I want to use combinations. Remember: once we’ve chosen the 2 boys and 2 girls, the position within the committee doesn’t matter. That is, either you are in the committee or out of the committee (there are no gold medalists here!).

The next thing to notice with this problem is that of the original 9 students, 2 leave, one boy and one girl. So that leaves us with 3 boys and 4 girls. We want to choose two each. Therefore, we have to set up one combination for boys and one for girls.

For boys, we have 3C2 and for girls we have 4C2. This gives us: 3C2 = 3 and 4C2 = 6

The final step, once we’ve figured out the combinations above, is to use the fundamental counting principle and multiply the number of possibilities for each part of the committee, not add them: 3 x 6 = 18, answer (C).
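To see why the two counts multiply, the committees can be listed outright. This is only an illustrative sketch: the names `B1`, `G1`, and so on are made-up labels for the 3 boys and 4 girls who remain.

```python
import math
from itertools import combinations, product

boys = ["B1", "B2", "B3"]          # 3 boys remain after one leaves
girls = ["G1", "G2", "G3", "G4"]   # 4 girls remain after one leaves

boy_pairs = list(combinations(boys, 2))    # 3C2 ways to pick the boys
girl_pairs = list(combinations(girls, 2))  # 4C2 ways to pick the girls

# Each committee pairs one boy pair with one girl pair, so the counts multiply.
committees = list(product(boy_pairs, girl_pairs))

print(len(boy_pairs), len(girl_pairs), len(committees))
```

Every one of the boy pairs can be matched with every one of the girl pairs, which is why the total is a product rather than a sum.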

Many students get stuck on this step and wonder, why don’t I add them. It’s a good question, so what I want you to do is imagine that you have 3 shirts and 3 pants. How many different shirt-pants getups can you wear? Well, for each shirt there are 3 options of pants. Therefore, we multiply and get 9.

My advice is to try about 5 or 6 more combinations/permutations problems so that you can get the hang of it. With a little practice, you’ll be able to deal with most of the problems the GRE hands you. Even if you miss a question (likely because it is very difficult), the fundamentals in this post should be enough to help you understand the explanation to that question, so that you can get a similar question right in the future.

*Editor’s Note: This post was originally published in May 2011 and has been updated for freshness, accuracy, and comprehensiveness.*


The post Does Order Matter? Combinations vs. the Fundamental Counting Principle on the GRE appeared first on Magoosh GRE Blog.

Let’s look at a couple of examples to illustrate.

Suppose we have 5 people waiting for 3 seats. Now let’s say I want to count how many ways 3 people can be arranged in those seats. There are three tasks here–one for each seat.

*fill 1st seat, fill 2nd seat, fill 3rd seat*

We can write those as 3 blanks:

_____ , _____ , _____

With this done, we’re most of the way there. We just need to check if filling one seat with one person is the same as filling another seat with that same person.

Amy , _____ , _____

_____ , Amy , _____

Now, we know those aren’t the same, because that makes two different line-ups. Keep in mind which task each blank represents. ***Amy in seat 1 is not the same as Amy in seat 2.*** So this is

5 options , 4 options , 3 options

5*4*3 = 60.

If we were simply looking for a group of people, not paying attention to which person takes which chair, then we’d use a combination because the tasks aren’t inherently separate. There would be no significance to which blank represents which task. That might be asking “how many ways three people can be chosen for a committee from a group of five,” for example. In that case, we *can’t* draw the tasks as separate. Selecting the first person for the committee is the same as selecting the second person. We could even choose all three at the same time, in one task; we don’t have to break it up into separate events. In that case, we have to use the combination formula as given in our combination lesson videos.

5C3

5! / ([5-3]! *3!)

5! / ([2]! *3!)

5*4 / 2

10
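The two counts above can be compared side by side in a short Python sketch. Only Amy is named in the example; the other four names are placeholders added for illustration.

```python
import math
from itertools import combinations, permutations

people = ["Amy", "Ben", "Cal", "Dee", "Eve"]  # only Amy appears in the post

lineups = list(permutations(people, 3))     # seat order matters
committees = list(combinations(people, 3))  # only membership matters

# Every committee of 3 can be seated in 3! = 6 different orders.
orders_per_committee = math.factorial(3)

print(len(lineups), len(committees), orders_per_committee)
```

Dividing the number of line-ups by the 3! orderings of each group gives the number of committees, which is what the combination formula does.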

Here’s another example. If you have 4 shirts, 4 pairs of pants, and 4 hats, and you choose one of each, is it a combination or does order matter? It’s tempting to say “order doesn’t matter”, because a t-shirt and jeans is the same as jeans and a t-shirt, but let’s try making those 3 tasks:

*pick a shirt, pick pants, pick a hat*

or

_____ , _____ , _____

Again, remember that each blank must represent one task alone, and we don’t move the blanks around. So if we say

red t-shirt, _____ , _____

That’s NOT the same as

_____ , _____ , red t-shirt

*A shirt in the first blank is not the same as a shirt in the last blank,* because the last blank is for choosing a *hat* and a shirt cannot be a hat. It’s not very fashionable at the moment, at least.

That means that we can use the fundamental counting principle here. First, we look at the shirt task. There are 4 shirts, so we have 4 possibilities. Next, and *separately*, we look at the pants. There are 4 pairs, so that’s another 4 possibilities for each shirt. So far, that’s 4*4. And finally, we have hats: there are another 4 hats, so that’s another 4 possibilities for each match up of shirt and pants. That’s 4*4*4.

So if we have *separate* blanks for each task, just fill in the blank with how many choices can be made in that specific step. Then multiply all the numbers for your answer.

On the other hand, if we were looking for ANY three pieces of clothing from the twelve total, we would use a combination formula. Again, this could be done in a single step–reach into a bag of clothes and grab three things. There’s no differentiation between them. This would give us 12C3, or 220.
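Both clothing counts can be checked with a few lines of Python. The item labels below are invented for the sketch; only the quantities (4 shirts, 4 pairs of pants, 4 hats) come from the example.

```python
import math
from itertools import product

shirts = [f"shirt{i}" for i in range(1, 5)]
pants = [f"pants{i}" for i in range(1, 5)]
hats = [f"hat{i}" for i in range(1, 5)]

# One blank per task: pick a shirt, pick pants, pick a hat.
outfits = list(product(shirts, pants, hats))

# Any three items from the pile of twelve, with no blanks to tell them apart.
any_three = math.comb(len(shirts) + len(pants) + len(hats), 3)

print(len(outfits), any_three)
```

The first count keeps a separate blank for each kind of item; the second treats all twelve items as one undifferentiated pile.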

Basically, this is all about drawing the blanks to represent the tasks. As you do so, ask if the item or person you’re picking can be moved around to different blanks without changing the situation. (Move the items–not the blanks!). If the item can be moved without changing the situation, it’s a combination–order doesn’t matter. If moving the item changes the situation (like in the line-up) or is impossible (like with the clothes), then we use the FCP.


The post How Many Statistics Questions are on the GRE? appeared first on Magoosh GRE Blog.

Back in the days of the old GRE, there was only one book out on the market written by ETS (the creators of the test): the hoary 1991 tome *Practicing to Take the GRE.* At that point, there were few, if any, questions relating to statistics (probably a straightforward median and mode question).

Of course many students would come out of the test reporting how many questions they’d seen on standard deviations or weighted averages. There was a clear discrepancy between what the GRE led students to believe was on the test, and what was actually on the test.

Surprisingly not that much has changed. Sure, the new Official Guide has an entire section on Statistics but only a few questions are actually scattered throughout the book, leaving students unsure how many statistics questions are on the GRE. Kaplan and the other usual suspects also give short shrift to this concept, so I still hear the refrain: there were lots of standard deviation questions.

Don’t get me wrong: the GRE hasn’t become one big test on statistics. But if we were to count all the median/mode, averages/weighted averages, and standard deviation questions on a GRE, there could be as many as eight, or roughly 20% of the test. Imagine doing only a few practice problems and then missing six of those eight questions: that alone would pull your score down to around 160 out of 170.

My advice is to not only go through the section in the Official Guide to the GRE, but to also do as many statistics practice questions as possible. The thing is statistics looks deceptively simple when the books cover the usual mean, median, mode business. Questions on the GRE take this relatively straightforward knowledge and concoct these fiendishly difficult questions.

By practicing such questions the hope is they won’t become so fiendish. The 5 lb book has plenty of statistics questions. For even tougher questions, Magoosh has quite a few.

At the same time, you should not be ferreting through your college textbook on statistics, trying to find the standard deviation to the nearest thousandth on a set of a hundred numbers.

The statistics on the GRE is more big-picture and conceptual than it is about crunching numbers. This fact will become evident once you start doing a few questions. To give you a foretaste: solving a standard deviation question is less about using the cumbersome formula and more about getting a sense of the standard deviation based on the numbers. Doing so will allow you to eliminate most, if not all, of the incorrect answer choices.


The post How Many Probability Questions are on the GRE? appeared first on Magoosh GRE Blog.

There are few concepts on the GRE that frighten students more than that of probability. Many go out of their way to study difficult probability questions, agonizing over concepts that are beyond the scope of the GRE. Test day, though, you might only see one probability question. You will probably see two, though you might see three. I doubt you’ll see any more than that. The number of probability questions on the GRE varies, but you can be confident that you’ll only see a few instances of probability on your test.

The types of probability questions you will see range from coin tosses to objects being removed from a group without replacement. While these questions can be very tough, the good news is it doesn’t get much tougher.

Perhaps one curveball that the GRE will throw at you is combining geometry and probability. For example, you might get a question that asks for the probability that a point chosen at random will fall within the shaded region of a figure. But again, do not agonize over such a possibility. You most likely will not see such a question. Even if you do, getting the hang of these questions (indeed, most probability questions on the test) just takes a little practice.

Speaking of, here’s a practice question to get your brain going (but hopefully not too worried):

GRE Probability Practice Question

So if you’re worried about how many probability questions are on the GRE, just take a breather and remember the probability is very low you’ll see a lot of probability! Okay, okay, I’ll stop with the probability jokes. But consider this: in the short time that you’ve been reading this article, it’s possible that you’ve already passed more time than you will spend on probability come test day! Hopefully that puts some things in perspective for you as you’re studying.

What scares you the most about probability? I’ll give you a hint: it should probably be that you’ll see a couple probability questions and have wasted precious study time! Anyway, let us know below. 🙂


The post GRE Math: Histograms appeared first on Magoosh GRE Blog.

First, a practice question about the following scenario.

In a survey, 86 high school students were randomly selected and asked how many hours of television they had watched in the previous week. The histogram below displays their answers.

1)

First, a reminder on histograms. Histograms are not simple bar or column charts. A histogram, like a boxplot, shows the distribution of a single quantitative variable. Here, we ask each high school student, “How many hours of TV did you watch last week?”, and each high school student gives us a numerical answer. After interviewing 86 students, we have a list of 86 numbers. The histogram is a way to display visually the distribution of those 86 numbers.

The histogram “chunks” the values into sections that occupy equal ranges of the variable, and it tells how many numbers on the list fall into that particular chunk. For example, the left-most column on this chart has a height of 13: this means, of the 86 students surveyed, 13 of them gave a numerical response somewhere from 1 hr to 5 hrs. Similarly, each bar tells us how many responses were in that particular range of hours of TV watched.

The median is the middle of the list. Here, there is an even number of entries on the list, so the median would be the average of the two middle terms — the average of the 43rd and 44th numbers on the list. We can tell that the first column accounts for the first 13 folks on the list, and that the first two columns account for the first 13 + 35 = 48 folks on the list, so by the time we got to the last person on the list in the second column, we would have already passed the 43rd and 44th entries, which means the median would be somewhere in that second column, somewhere between 6-10.
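The cumulative-count reasoning can be sketched in Python. The first three bar heights (13, 35, 17) and the total of 86 come from the text above; the remaining bins and their counts are hypothetical, chosen only so that the total comes to 86, since the histogram itself is not reproduced here.

```python
from itertools import accumulate

# Bin labels and the first three counts (13, 35, 17) come from the post.
# The last three bins and their counts are hypothetical, chosen only so the total is 86.
bins = ["1-5", "6-10", "11-15", "16-20", "21-25", "26-30"]
counts = [13, 35, 17, 11, 6, 4]

def bin_of(position, counts):
    """Return the index of the bin holding the entry at a 1-based sorted position."""
    for index, running_total in enumerate(accumulate(counts)):
        if position <= running_total:
            return index
    raise ValueError("position is beyond the end of the list")

total = sum(counts)                              # 86, an even number of entries
middle_positions = (total // 2, total // 2 + 1)  # the 43rd and 44th entries
median_bins = {bins[bin_of(p, counts)] for p in middle_positions}

print(total, middle_positions, median_bins)
```

Because both middle entries land in the same bar, the median must lie in that bar's range, whatever the hypothetical upper bars look like.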

To calculate the mean, we would have to add up the exact values of all 86 entries on the list, and then divide that sum by 86. In a histogram, we do not have access to exact values: we only know the ranges of numbers. For example, there are seventeen entries between 11 hrs and 15 hrs, but we don’t know exactly how many students said 11 hrs, how many said 12 hrs, etc. Therefore, **it is impossible to calculate the mean from a histogram**. No one will ask you to do that. No one could reasonably expect you to do that, precisely because it is, in fact, impossible.

If it’s impossible to calculate the mean, then how in tarnation can the GRE expect us to compare the mean to the median? Well, here we need to know a slick little bit of statistical reasoning. Consider the following two lists:

List A = {1, 2, 3, 4, 5}

median = 3 and mean = 3

List B = {1, 2, 3, 4, 100}

median = 3 and mean = 22

In changing from List A to List B, we took the last point and slid it out on the scale from x = 5 to x = 100. We made it an “**outlier**”, that is, a point that is noticeably far from the other points. Notice that the median didn’t change at all. The median doesn’t care about outliers. The median simply is not affected by outliers. By contrast, the mean changed substantially because, unlike the median, **the mean is sensitive to outliers**.
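The two lists can be checked with Python's standard `statistics` module. This is a small sketch using exactly the values of List A and List B.

```python
from statistics import mean, median

list_a = [1, 2, 3, 4, 5]
list_b = [1, 2, 3, 4, 100]  # same list with the last point slid out to 100

for name, values in (("A", list_a), ("B", list_b)):
    print(name, "median =", median(values), "mean =", mean(values))
```

Only the mean moves when the last value is dragged out to 100; the middle entry of the sorted list stays where it was.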

Now, consider a symmetrical distribution of numbers — it could be a perfect Bell Curve, or it could be any other symmetrical distribution. In any symmetrical distribution, the mean equals the median. Now, consider an asymmetrical distribution: if the outliers are yanked out to one side, then the median will stay put, but the mean will be yanked out in the same direction as the outliers. **Outliers pull the mean away from the median**. Therefore, if you simply notice on which side the outliers lie, then you know in which direction the mean was pulled away from the median. That makes it very easy to compare the two. The comparison is purely visual, and involves absolutely no calculations of any sort. (Yes, sometimes you can “do math” simply by looking!)

Having read this, you may want to look at the QC above before reading the solution below.

1) If you think you have to calculate both the median and the mean, then this question would be impossible, since it’s impossible to calculate the mean from a histogram. If you know the trick discussed above, then all we have to notice is that the outliers, the points most distant from the central hump, are at the upper end. They are on the “high side” of the hours scale. The median probably just sits inside that central hump, but the mean has been pulled away from the median in the direction of the outliers, that is, in the direction of the high side of the scale. That means, the mean is higher up on the hours scale than is the median. That means, the mean is greater than the median. Answer = **A**

Notice, this solution involves zero calculations. It is 100% visual.


The post Best Fit Lines in GRE Data Interpretation appeared first on Magoosh GRE Blog.

One category of graph you certainly could see on GRE Data Interpretation questions is the **scatterplot**, and its associated idea of the **best fit line**. Let’s talk about how these beasts operate!

To begin, let’s review scatterplots. When each data point (each person, each car, each company, etc.) gives you a value for two different variables, then you can graph each data point on a scatterplot. Here’s an example. Suppose we survey ten students who came from the same high school to the same college. We ask each student for their total SAT score (M + CR + W) and their GPA in the first semester of their freshman year in college. Each student appears below as a single dot, the location of which shows that student’s SAT score and first semester GPA.

As one would expect, there’s a general “upward” trend: students with higher SAT scores tended to perform better in their freshman year of college. At the same time, there’s some chance variation: right in the middle, three students all scored in the 1700’s on their SATs but, for whatever reasons, had different results in the first semester of their freshman year.

We see there’s a general “upward” pattern to this scatterplot. Suppose we wanted to make a *prediction* based on that pattern. For example, a current high school senior in this high school, planning to attend this same college, would know her SAT score and might be curious about her predicted GPA in her upcoming freshman year of college.

We formalize this pattern by drawing what is sometimes called a “best fit” line. Excel calls this a “trendline.” The official name in Statistics is the Least-Squares Regression Line, but you don’t need to know that. Nor do you need to understand the mathematical details of why this line, as opposed to any other possible line, is in fact the “best fit.”

Here’s the same graph with a best fit line.

The best fit line abstracts a common pattern from the individual data points. The best fit line represents the expected relationship: if we know a new student’s SAT score, then, on average, what would we predict for that student’s first semester college GPA? One student appears almost exactly on the best fit line (sometimes a data point or two will be on the trendline, and sometimes none will be); in this case, we can say that student’s GPA is more or less what we would expect from her SAT score. There are five dots clearly above the best fit line: these five students had higher GPA’s than what we would have predicted from their SAT scores. Four dots are below the line: those four students had first semester GPA’s lower than what we would expect, given their SAT score. Notice that questions of the form “how many individuals had a higher/lower (y-value) than what we would expect from their (x-value)?” are simply asking you to count dots above or below the best fit line.

We also need to make a distinction between people or data used to generate the line, and the new data points predicted by the line. In this case, we used 10 people to generate the best fit line. We have no predictions to make about those 10 people: both their SAT scores and first semester GPA’s are known, now things of the past. If we are asked for the now-completed first semester GPA of the person who had a 1780 SAT score, we look for that dot: that’s the low dot in the middle of the graph, with a value of 2.7 for the GPA (too much first semester partying for that person?). A very different question is: suppose a new person, a high school senior, has a 1780 SAT score and would like to predict her first semester college GPA. For a prediction, we are looking not at any individual point but at the line: the line has a y-coordinate of about 3.2 there, so, on average, we would predict a GPA of about 3.2 for this current high school senior.
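The GRE will never ask you to compute a best fit line, but a short sketch may make the dots-versus-line distinction concrete. The ten (SAT, GPA) pairs below are hypothetical, since the post's graph is not reproduced here; only the 1780 / 2.7 student is taken from the text, and the counts of dots above and below the line will not match the graph described above.

```python
# Hypothetical (SAT score, first-semester GPA) pairs, for illustration only.
# Only the (1780, 2.7) student is taken from the text.
students = [(1500, 2.6), (1580, 3.0), (1650, 2.9), (1720, 3.4), (1750, 3.1),
            (1780, 2.7), (1850, 3.5), (1900, 3.3), (1980, 3.8), (2050, 3.7)]

def least_squares(points):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = least_squares(students)

def predict(sat):
    """The future is the line: predicted GPA for a new student."""
    return slope * sat + intercept

# The past are the dots: compare each known GPA with the line's prediction.
above = [s for s in students if s[1] > predict(s[0])]
below = [s for s in students if s[1] < predict(s[0])]

print(round(predict(1780), 2), len(above), len(below))
```

With this made-up data, the student who actually earned a 2.7 sits below the line, while the line's prediction for a new student with the same SAT score comes out close to 3.2.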

The past are the dots, the future is the line.

Here’s a practice question to test your understanding of the best fit line: http://gre.magoosh.com/questions/2290


The post GRE Math: Percentiles and Quartiles appeared first on Magoosh GRE Blog.

Fact: An 8 year old boy who is 4’5″ (53 inches) tall is in the 86th percentile for height for his age.

What on earth does that mean? Well, the percentile of an individual tells you what percent of the population has a value of a variable below that individual’s value of that variable. For example, to say that a 4’5″ 8 year-old boy is in the 86th percentile for height for his age, we are saying: gather together all 8 year-old boys on Earth, and measure their heights; if you sort out all the 8 year-old boys who have a height less than 4’5″, they will comprise approximately 86% of the population. That boy is taller than 86% of other boys his age, which means he’s in the 86th percentile.

Percentiles are a relatively unlikely topic to see on the GRE, but if they do show up, here are a few handy facts to have up your sleeve.

A few details to clarify. The individual with the lowest value of the variable, with the minimum value, is not bigger than anyone, so the lowest percentile, the percentile of the rock-bottom minimum, is the 0th percentile. If my score is in the 0th percentile, then I am not higher than anyone.

What’s trickier is the maximum score. If my score is the highest score, I am higher than everybody else, but that’s **not** the 100th percentile, because in order to be higher than 100% of the population, higher than everyone, I would have to have a score higher than my own score: a paradox! In fact, for this very reason, there’s no such thing as a 100th percentile. The person with the highest score is higher than everybody else, but not higher than herself, so she’s in the 99th percentile. If we are sticking with whole numbers, the 99th percentile is the highest possible percentile. If we go to decimals, we can get higher with the 99.9th percentile (1 out of 1,000), the 99.99th percentile (1 out of 10,000), etc.

The median is the middle of a list: the median divides a list into an “upper half” and a “lower half.” This means, the median is higher than the lower half of the population, higher than 50%, so the median is the 50th percentile. Now, we have to be careful here. On a list with only three members — e.g. {2, 4, 7} — the median is the middle number, here 4, but that number is higher than only one number out of three — so 4 is the 33rd percentile of that list. In a technical sense, the median is not always the 50th percentile.
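The "percent strictly below" definition is easy to try on the three-member list. The helper name `percentile_rank` is made up for this sketch; it simply implements the definition used in this post.

```python
def percentile_rank(values, x):
    """Percent of the values strictly below x, as the post defines a percentile."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

small_list = [2, 4, 7]
print(percentile_rank(small_list, 2))  # the minimum is above no one
print(percentile_rank(small_list, 4))  # the median of a 3-item list
print(percentile_rank(small_list, 7))  # the maximum still is not above itself
```

On a list this small, the median lands well short of 50 and the maximum well short of 100, which is the technical objection raised above.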

In some sense, though, that’s a specious objection. When there are only 3 members on a list, nobody in their right mind talks about percentiles. When the total number is less than a few hundred, there’s seldom talk of a percentile. Percentiles, by their very nature, are a way to make sense of tens of thousands, even millions of individuals. How many 8 year-old boys are there on Earth? Who knows, but it’s certainly a very very large number. That’s where percentiles are used in practice.

When the number of folks in the group is that large, then for all intents and purposes, the median is the 50th percentile. If you are familiar with the idea of quartiles, then the first quartile is the 25th percentile and the third quartile is the 75th percentile, again, when the group sizes are truly huge.

1) Sasha took a nationwide standardized test that is graded on a scale from 20 to 60. Sasha got one of the best scores recorded on this test.

**Column A**

Sasha’s score

**Column B**

the percentile of Sasha’s score

(A) The quantity in Column A is greater.

(B) The quantity in Column B is greater.

(C) The two quantities are equal.

(D) The relationship cannot be determined from the information given.

2) Alice took a nationwide standardized test that is graded on a scale from 0 to 100. Alice scored the highest score recorded on this test.

**Column A**

Alice’s score

**Column B**

the percentile of Alice’s score

(A) The quantity in Column A is greater.

(B) The quantity in Column B is greater.

(C) The two quantities are equal.

(D) The relationship cannot be determined from the information given.

3) A large set of scores is normally distributed.

**Column A**

score that’s one standard deviation above the mean

**Column B**

score that is at the 80th percentile

(A) The quantity in Column A is greater.

(B) The quantity in Column B is greater.

(C) The two quantities are equal.

(D) The relationship cannot be determined from the information given.

(1) **B**; (2) **D**; (3) **A**;

1) We know that Sasha is near the top of the scoring distribution, so that would mean a score with a percentile close to the 99th percentile. Because of the scoring scale, the score is not going to be above 60, so the percentile is clearly bigger. Answer = **B**.

2) Alice got the highest score, so by definition, that’s the 99th percentile. What we don’t know is: how hard was this test? What score was the highest score? If it was a particularly challenging test, it could be that the highest score anyone achieved was only, say, a 73. In that case, the percentile would be greater. If, on the other hand, it was possible to get a perfect score, and Alice did in fact do that, then her score of a 100 would be greater than the percentile. We don’t have enough information to decide. Answer = **D**.

3) Here, it might be helpful to brush up on the normal distribution. On a normal distribution, it’s always true that about 68% of the population lies within one standard deviation of the mean. That means half of that, 34%, lies between the mean and one standard deviation above the mean. A score that is one standard deviation above the mean is higher than that 34%, as well as the 50% below the mean. That means a score one standard deviation above the mean is at the 50 + 34 = 84th percentile. Thus, it’s higher than a score at the 80th percentile. Answer = **A**.
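For readers who want to confirm the 84th-percentile figure, here is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). It is a check on the rule of thumb, not something the GRE expects you to compute.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Percent of a normal population below one standard deviation above the mean.
percentile_at_one_sd = 100 * standard_normal.cdf(1)

# How many standard deviations above the mean the 80th percentile sits.
z_at_80th = standard_normal.inv_cdf(0.80)

print(round(percentile_at_one_sd, 1), round(z_at_80th, 2))
```

The 80th percentile falls at less than one standard deviation above the mean, which agrees with answer A.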


The post Standard Deviation on the New GRE appeared first on Magoosh GRE Blog.

Many quake in their boots when they hear that Statistics will be covered on the GRE. They run to their college stats textbooks, dust off the cover, roll up their sleeves, and start computing the standard deviation of a list of twenty three-digit numbers. Stop, if this in any way describes you.

The Statistics on the GRE is much simpler, and does not test your aptitude at crunching numbers as much as it does your ability to think about Statistics. That is, you will rely more on intuition than computation on statistics questions on the GRE. You shouldn’t be so worried about how many statistics questions there are on the GRE, anyway.

To illustrate take a look at the following question.

1- The standard deviation on a test was 12 points, and the mean was 70. If the scores fell along a normal distribution and student X scored 95 points, then student X scored higher than approximately what percent of students?

- 2%
- 13%
- 48%
- 96%
- 98%

Answering this question correctly requires understanding the normal distribution (that is, the distribution of scores along the familiar bell curve). To understand how standard deviation relates to the bell curve, take a look below:

Within 1 Standard Deviation Above the Mean = 34%

Within 1 Standard Deviation Below the Mean = 34%

Between 1 and 2 Standard Deviations Above the Mean = 13.5%

Between 1 and 2 Standard Deviations Below the Mean = 13.5%

Between 2 and 3 Standard Deviations Above the Mean = 2%

Between 2 and 3 Standard Deviations Below the Mean = 2%

In the problem above, 34% of students scored between 70 and 82. Likewise, 34% of students scored between 58 and 70. This symmetry is very important, and you will notice that the bell curve is symmetrical (or even) on both sides. So, given a large enough sample size, the number of students who scored three standard deviations below the average of 70 (34) is the same as the number who scored three standard deviations above the average (106).

Returning to the actual question, we want to find how many standard deviations above the average a score of 95 points is: 95 – 70 = 25, which is a tiny bit more than two standard deviations. The question is asking for an approximation, so we can round down 25 to 24.

Looking at the table above, we can see that a score two standard deviations above the mean is better than the 34% + 13.5% = 47.5% of scores that lie between the mean and that score. The trick here is to not forget to account for the left side of the bell curve, which is 50% (after all, half the scores are on the left side and the other half on the right side; don’t forget the symmetry of the bell curve).

That gives us a total of 50% + 47.5% = 97.5%, which rounds to (E) 98%.
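For the curious, the band-adding shortcut can be compared against the exact normal-curve value. This is only a sketch using Python's standard-library `statistics.NormalDist`; the variable names are my own, and you would never need this precision on test day.

```python
from statistics import NormalDist

mean, sd, score = 70, 12, 95

# Number of standard deviations the score lies above the mean.
z = (score - mean) / sd

# Shortcut from the text: treat the score as 2 SDs up and add the bands.
approx_percent = 50 + 34 + 13.5

# Exact share of students scoring below 95 on a normal curve.
exact_percent = NormalDist(mean, sd).cdf(score) * 100

print(f"z = {z:.2f}")
print(f"band estimate = {approx_percent}%, exact = {exact_percent:.1f}%")
```

Both the estimate and the exact value round to 98%, so the intuition-based shortcut is all the question requires.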

Let’s try another problem.

2. The reaction time of 1000 Rhesus monkeys was measured. The average time it took the monkeys to respond to a quickly moving object in their visual fields was .135 seconds, with a standard deviation of .021 seconds (assume a normal distribution). If one of the geriatric monkeys had a reaction time of .205 seconds, then that monkey’s reaction time is how many standard deviations from the mean?

- 0 – 1 standard deviations
- 1 – 2 standard deviations
- 2 – 3 standard deviations
- 3 – 4 standard deviations
- 4 – 5 standard deviations

This is exactly the sort of daunting problem that the GRE likes to throw at you. Believe it or not, there is very little math involved. Again, you want to rely on intuition more than math.

.205 – .135 = .07. If the standard deviation is .021, we can determine the number of standard deviations the monkey’s reaction time is from the mean: .07/.021, which equals approximately 3.3. Therefore, the answer is (D): the geriatric monkey’s reaction time is 3 – 4 standard deviations from the mean.
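The same arithmetic can be written out as a tiny z-score calculation. This is just a sketch with illustrative variable names; the numbers come straight from the problem.

```python
mean_time = 0.135    # seconds
sd_time = 0.021      # seconds
monkey_time = 0.205  # seconds

# Distance from the mean, measured in standard deviations.
z = (monkey_time - mean_time) / sd_time

# Whole number of standard deviations just below z.
lower = int(z)

print(f"z = {z:.2f}, between {lower} and {lower + 1} standard deviations")
```

Notice that only the bracket matters for the answer choice, so a rough mental estimate (.07 is a little more than three times .021) is enough.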

To do well on statistics questions on the GRE, you have to rely more on intuition than on number crunching. Having a strong sense of the normal distribution and how standard deviation relates to it will help you immeasurably.

The post GRE Quartiles and the Interquartile Range appeared first on Magoosh GRE Blog.

Statisticians point out that it’s often useful to “chunk” data to understand it. What does it mean to “chunk” data? It means dividing a long list into smaller chunks so that, with a few well-chosen numbers, we can get a sense of the layout of the list.

The fundamental “chunking” number is the median. The median is the middle of the list: that is, it divides the list into two chunks: an upper list and a lower list. This one number, the median, tells you both the maximum of the lower list and the minimum of the upper list.

Quartiles extend this idea. First, find the median, which divides the entire list into a “top 50%” list and a “bottom 50%” list. Now, find the median of each one of these lists. The median of the “bottom 50%” is called Q_{1}, the **first quartile**. The median of the “top 50%” is called Q_{3}, the **third quartile**. The quartiles are called “quartiles” because the two quartiles and the median nicely divide the list into four equal chunks.

- the lowest 25% of the list is below the first quartile
- the next 25% of the list is between the first quartile and the median
- the next 25% of the list is between the median and third quartile
- the highest 25% is above the third quartile.

Notice that we don’t use the term “second quartile” because the median plays the role of the second quartile.
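The median-of-halves recipe described above can be sketched in a few lines of Python. The function name `quartiles` and the sample list are hypothetical, and I'm assuming one common convention: when the list has an odd number of entries, the middle value is left out of both halves.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]              # the "bottom 50%" list
    upper = data[len(data) - half:]  # the "top 50%" list (middle value skipped if odd)
    return median(lower), median(data), median(upper)

scores = [1, 3, 5, 7, 9, 11, 13, 15]  # made-up data for illustration
q1, med, q3 = quartiles(scores)
print(q1, med, q3)
```

For this made-up list, two entries fall in each of the four chunks, which is exactly the "four equal chunks" picture.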

Often, statisticians are bothered by outliers, that is, extreme high or low values. An outlier is a member of the list that is not representative of most of the list. In the list of household incomes in the US, the incomes of Bill Gates and Warren Buffett are not representative of the rest of us: they are outliers. Outliers, by definition, will always be at the very top or the very bottom of a list.

Notice that the “top 50%” and the “bottom 50%” will, between them, necessarily contain any outliers. Would it be possible to talk about a “half” of the population that definitely contains no outliers? Well, instead of the “top 50%” or the “bottom 50%”, we could take the “**middle 50%**”. What’s that? Well, suppose we look at all the folks between the first quartile and third quartile. We know that a quarter of the population is between the first quartile and the median, and a quarter between the median and the third quartile, so between the first quartile and the third quartile is 50% of the population, and it’s the 50% that’s in the middle of the population. This is called the **interquartile range**: the set of data entries from the first quartile to the third quartile. It’s a big deal because it’s not the upper half or lower half but rather the **middle half** of the data. For this reason, statisticians feel it gives a very good representation of where the typical data lie.
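To see why the middle 50% shrugs off outliers, here is a small sketch comparing the full range with the width of the interquartile range, Q_{3} minus Q_{1}. The helper `interquartile_range` and both lists are hypothetical, and the quartiles again use the median-of-halves convention.

```python
from statistics import median

def interquartile_range(values):
    """Width of the middle 50%: Q3 minus Q1 (median-of-halves method)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    return q3 - q1

# Made-up data: the second list swaps the top value for an extreme outlier.
typical = [20, 30, 40, 50, 60, 70, 80, 90]
with_outlier = [20, 30, 40, 50, 60, 70, 80, 5000]

for label, data in (("typical", typical), ("with outlier", with_outlier)):
    full_range = max(data) - min(data)
    print(label, "range:", full_range, "IQR:", interquartile_range(data))
```

The outlier stretches the full range enormously, while the interquartile range stays the same in both lists.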

Consider the geographic size of countries. On planet Earth, what is the size of a typical country? Well, if we list the countries and their areas, we find the maximum is Russia (16,995,800 sq km) and the minimum is the Holy See (0.44 sq km). Obviously, neither one of those is typical of the area of a country.

The median value on the list is 50,660 sq km (Costa Rica). So that’s interesting: half the countries on Earth have more area than Costa Rica, and half have less. Incidentally, the US State of West Virginia is slightly bigger than this, so little old West Virginia has more area than half the countries on Earth. Who would have thought that? 🙂

The third quartile is 325,360 sq km (Vietnam) and the first quartile is 572 sq km (the Isle of Man). So, even within the interquartile range, there’s huge variation from 572 up to 325,360. Still, we can say half the countries on Earth have more area than the Isle of Man but less area than Vietnam. That’s where the middle 50% lies. That would be, in many ways, the most representative range for the size of a “typical” country.

The post Normal Distribution on the GRE appeared first on Magoosh GRE Blog.

A distribution is a graph that shows what values of a variable are more or less common in a population. Where the graph is higher, there are more people, and where the graph has a height close to zero, there are fewer people.

By far, the most famous and most useful distribution is the Normal Distribution, a.k.a. the “bell curve.” It shows up *everywhere*, with an almost eerie universality. Suppose you were to measure one genetically determined bodily measurement (e.g. thumb length, distance between pupils, etc.) for every single human being on the planet, and then graphed the distribution: it would be a normal distribution. The same goes for any genetically determined bodily measurement of an animal or a plant: measure it for every member of that species, and the result would be a normal distribution. The normal distribution is the shape of the distribution of any naturally occurring variable of any natural population. (Something like blood pressure might not be as normally distributed, because there are cultural and social factors that impinge on blood pressure – it’s not purely natural, unadulterated by culture.)

All normal distributions on earth, from giraffe height to ant height, share certain fundamental properties.

It’s important to appreciate that any Normal Distribution comes with its own “yardstick”, and that yardstick is the standard deviation. You can read more about standard deviation here. The very center of the Normal Distribution is the mean and median and mode all in one. We use the standard deviation to measure distances from the mean. If we go out a length of one standard deviation from the mean on either side, that always includes 68% of the population, a little over two-thirds. This means that on either side, there is 34% of the population, very close to one-third: there’s 34% between the mean and one standard deviation below the mean, and there’s another 34% between the mean and one standard deviation above the mean.

If we go two standard deviations from the mean in either direction, that always includes 95% of the population. You are somewhat uncommon if you are more than two standard deviations from the mean.

If we go out to three standard deviations from the mean in either direction, that includes 99.7% of the population, with only 0.15% (i.e. 15 people out of 10,000) falling in each tail beyond this. The folks who are more than three standard deviations above the mean are the true outliers (the major league baseball hitters, the world-famous violinists, the brilliant scientists and researchers): they truly stand out from the population at large.
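These percentages can be checked against the exact bell curve. Below is a minimal sketch using Python's standard-library `statistics.NormalDist`; the names `coverage` and `each_tail` are my own.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of the population within k standard deviations of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, within in coverage.items():
    each_tail = (1 - within) / 2  # share beyond k SDs on one side
    print(f"within {k} SD: {within:.2%}   each tail: {each_tail:.3%}")
```

The exact values round to the familiar 68%, 95%, and 99.7%. The exact tail beyond three standard deviations is slightly smaller than the 0.15% you get by splitting the rounded 99.7% figure, but the rounded numbers are the ones worth memorizing.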

If you simply remember these two numbers:

- **68%** within one standard deviation of the mean (which means 34% on each side)
- **95%** within two standard deviations of the mean

then you will have the ability to figure out any GRE Math question that addresses the Normal Distribution.
