# GMAT Critical Reasoning: Populations

To begin, here are a couple of GMAT CR questions, variations on a theme.

1) Colfax Beta-80 is a rare genetic defect found primarily in people of Scandinavian descent. Over 97% of known carriers of this defect are citizens of, or are direct descendants of immigrants from, Denmark, Norway, and Sweden. People who carry the Colfax Beta-80 defect are at substantially higher risk for contracting Lupus and related autoimmune diseases.

Assuming the statements above are true, which of the following can be inferred from them?

(A) People from Denmark are at a higher risk for Lupus than people of other, non-Scandinavian countries.
(B) Genetic engineering that eradicated this genetic defect would constitute a de facto cure for Lupus.
(C) Finding a cure for Lupus would eliminate most of the health threats associated with the Colfax Beta-80 defect.
(D) A person not of Scandinavian descent born with the Colfax Beta-80 defect is more likely to contract Lupus than is a Scandinavian who is born without this defect.
(E) The majority of people who contract Lupus are either Scandinavian or of Scandinavian descent.

2) In social science research, “highest education level attained” would refer to the most advanced grade or degree achieved by an individual — for some individuals, it may be a grade in grade school, and for other individuals, it may be a Bachelor’s Degree, a Master’s Degree, or Ph.D. (which is considered the highest education level). A recent study has shown a strong correlation between highest education level attained and proficiency in chess.  Another result, studied at many points throughout the 20th century, shows a marked positive correlation between highest education level attained and income level.

Assuming the statements above are true, what conclusion can be drawn from them?

(A) If one practices chess enough to raise one’s proficiency, one has a good chance of raising one’s income level.
(B) It is possible that a person who has attained only a sixth grade level of education could earn more than a person who has a Ph.D.
(C) If Jane has a Ph.D., and Chris has not finished his undergraduate degree, then Jane will usually beat Chris in chess.
(D) The average salary for people who have completed three-year Master’s Programs is higher than the average salary of people who have completed two-year Master’s Programs.
(E) An individual’s proficiency at chess rises consistently during that individual’s years of school, and levels off once that individual has finished her years of formal education.

## Reasoning with populations

Folks who have not studied statistics tend to fall into some predictable mistakes when thinking about issues involving correlation. After all, correlation is a very sophisticated idea. Most educated people have heard this word and have a vague idea (e.g. if A goes up, B goes up), but this vague popular understanding lends itself to some obvious misunderstandings that the GMAT loves to exploit.

## Mistake #1: Correlation & Causality

Any veteran of statistics has probably heard the mantra: correlation does not imply causality. This is tricky because, of course, the converse is true: causality does, in fact, imply correlation. If A reliably causes B, then whenever you find A, you will be likely to find B. For example, smoking causes a large collection of undesirable conditions, including lung cancer, emphysema, and heart disease, and sure enough, it is highly correlated with each of these.

The catch, though, is that A and B can be correlated even when A does not cause B. For example, A & B would be highly correlated if they were both responses to the same underlying cause: beer sales and ice cream sales are highly correlated, not because folks like having beer a la mode, but because another cause, hot weather, drives both. There are other, more complicated relationships we will not explore here in which A & B would tend to show up together (that is, they would be correlated), but in none of them would one be causing the other.
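To make the common-cause idea concrete, here is a minimal simulation sketch. All of the numbers (temperature range, sales coefficients, noise levels) are invented for illustration, and `pearson` is a small helper written for this example rather than anything from the GMAT or a particular library. In the simulated data, temperature drives both kinds of sales, and neither kind of sales affects the other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily data: temperature is the only shared driver.
temps = [rng.uniform(0, 35) for _ in range(5000)]
beer = [50 + 3 * t + rng.gauss(0, 10) for t in temps]
ice_cream = [20 + 2 * t + rng.gauss(0, 10) for t in temps]

# Across all days, the two sales figures move together.
r_all = pearson(beer, ice_cream)

# Hold the weather roughly fixed: look only at days between 20 and 22 degrees.
band = [(b, i) for t, b, i in zip(temps, beer, ice_cream) if 20 <= t < 22]
r_band = pearson([b for b, _ in band], [i for _, i in band])

print(f"correlation over all days:       {r_all:.2f}")
print(f"correlation at fixed temperature: {r_band:.2f}")
```

In this sketch, the correlation is strong over all days but largely disappears once the temperature is held roughly constant, which is what one would expect if the weather, and not either product, is doing the causal work.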

Another way to say this is: correlation is relatively easy to demonstrate.  All you need is broad sociological or epidemiological data, and you can show correlation. Anyone with a data set and statistical software can demonstrate correlation.  By contrast, demonstrating causality is often a major scientific achievement, sometimes worthy of a Nobel Prize.   To demonstrate that A causes B, one would need to show dozens and dozens of conditions are met, only the most elementary of which is that A is correlated with B.

In any question about correlation, the GMAT loves incorrect answers that blur the distinction between correlation and causality.

## Mistake #2: The Problem of Scope

A correlation is something that exists across a whole population. In the natural sciences, and especially in the physical sciences, one can get extremely tight correlations, such that all the data points lie on a straight line. In that case, the correlation is true not only at the population level, but also at the level of individual points: if one point is higher in A, that point must be higher in B.

When the GMAT talks about correlation, mostly this will not be in the context of the natural sciences. Instead, it will be in the context of the social sciences. Human populations are messy. There’s always a ton of random fluctuation involved in anything you measure about people, and this makes the social sciences considerably less precise than the natural sciences. A correlation in the social sciences is something that’s true in a population-wide view, but when the scope shifts to individual-to-individual comparisons, the statistical noise is too great to discern any reliable pattern.

For example, one carefully measured social science study demonstrated a correlation between income and height. If one steps back and looks at the whole population, one can discern a mild relationship: on average, taller people are slightly more likely to have higher salaries than are shorter people. At the level of whole populations, at the level of probabilities, this relationship holds. Now, switch to the individual level. It’s sheer nonsense to say that, if Alex is taller than Bert, then Alex must be richer than Bert. It’s trivially easy to find single examples of poor tall people and rich short people. The correlation is something that is true in the population view, but at the level of individuals, it’s virtually meaningless, except as a very weak probability statement. Folks not familiar with statistics forget this, and get almost “fundamentalist” in their interpretation of correlation, as if the fact that A is correlated with B means that in every single instance that A goes up, it absolutely must be true that B goes up. The GMAT loves to prey on this kind of misconception.
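The gap between the population view and the individual view can be sketched with a small simulation. The data below are entirely made up: heights and incomes are standardized scores, and the 0.2 coefficient is an assumed value chosen only to produce a mild correlation, not a figure from any real study. The `pearson` helper is again written just for this example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
N = 20000

# Hypothetical population: income depends weakly on height, plus lots of noise.
heights = [rng.gauss(0, 1) for _ in range(N)]
incomes = [0.2 * h + rng.gauss(0, 1) for h in heights]

# Population view: a mild but real positive correlation.
r = pearson(heights, incomes)

# Individual view: pair people off and ask how often the taller one is richer.
half = N // 2
agree = sum(
    (heights[i] > heights[i + half]) == (incomes[i] > incomes[i + half])
    for i in range(half)
)
frac_taller_is_richer = agree / half

print(f"population correlation:          {r:.2f}")
print(f"pairs where taller is also richer: {frac_taller_is_richer:.1%}")
```

Under these assumptions the correlation is clearly positive, yet in a randomly chosen pair the taller person is richer only a little more often than a coin flip would predict. That is the "very weak probability statement" described above.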

## Summary

If reading this post gave you any insights into the nature of correlation, you might give those questions at the top another look before reading the explanations below.  If you would like to share any insights or ask a question, let us know in the comment section at the bottom!

## Solutions to the Practice Questions

1) The prompt tells us that Colfax Beta-80 is a genetic defect. Most of the folks who have this defect are Scandinavian, but we don’t know what percent of Scandinavians have this defect. It may be a substantial portion, but that’s unlikely because the defect is “rare.” Much more likely, there may be only a couple hundred people in the whole world who have this defect, and 97% of this couple hundred are from Scandinavia: a large percentage among those with the defect, but not a large percentage among the Scandinavian population as a whole. A couple of answer choices conflate these two percentages.

Anyone with this defect is at higher risk for Lupus and other autoimmune diseases.

(D) is the credited answer.   This more or less restates the information of the last sentence.  Anyone with this defect (Scandinavian or not) has a substantially higher risk of Lupus compared to anyone without the defect (Scandinavian or not).

(A) & (E) play on the misunderstanding about what the 97% implies.  Most folks with this genetic defect are Scandinavian, but that doesn’t imply that most Scandinavian people have this defect.   People with the defect are at higher risk for Lupus, but that doesn’t mean large sections of the Scandinavian population are at risk for Lupus.
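The two percentages that (A) and (E) conflate can be put side by side with some rough arithmetic. Only the 97% comes from the prompt. The carrier count ("a couple hundred") and the population figure are assumptions made up for this sketch, and descendants living elsewhere are ignored for simplicity.

```python
# From the prompt: share of known carriers who are Scandinavian.
scandinavian_share_of_carriers = 0.97

# Assumptions for illustration only (not given in the prompt).
carriers_worldwide = 200               # "a couple hundred" carriers
scandinavian_population = 20_000_000   # rough combined population figure

scandinavian_carriers = carriers_worldwide * scandinavian_share_of_carriers
carrier_share_of_scandinavians = scandinavian_carriers / scandinavian_population

print(f"Share of carriers who are Scandinavian: {scandinavian_share_of_carriers:.0%}")
print(f"Share of Scandinavians who are carriers: {carrier_share_of_scandinavians:.5%}")
```

With numbers anywhere in this neighborhood, nearly all carriers are Scandinavian while only a tiny sliver of Scandinavians are carriers, so the elevated Lupus risk attaches to very few people in the population as a whole.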

(B) is wrong because, while we are told this genetic defect causes susceptibility to Lupus, we don’t know what other factors might cause or contribute to Lupus.  Just because we eliminate this one factor does not mean we would eliminate everything in the world that could possibly contribute to the onset of Lupus.

(C) is wrong because, while we are told this genetic defect raises the risk of Lupus, we are also told it raises the risk of related autoimmune diseases. Even if we had a cure for Lupus, these other autoimmune diseases would still pose health threats to carriers of the defect.

2) This question presents two correlations, education level with chess, and education level with income.  We would do well to remember both errors mentioned above.

(B) is the credited answer.   In the population view, higher education level is correlated, on average, with higher income, but this doesn’t apply at the individual level.   Indeed, despite the overall population pattern, it would certainly be possible to find someone with a sixth-grade education who struck a fortune and therefore was richer than many people with Ph.D.’s.  It wouldn’t be likely, if we picked a random person with a sixth-grade education and a random Ph.D., but it would be possible.

(A) plays on the correlation-causality fallacy.  Chess is correlated with education level, but doesn’t “cause” education level.  Education level is correlated with income, but doesn’t singlehandedly “cause” income.  There is no reason to conclude what (A) says.

(C) plays on the fallacy of scope. Yes, there’s a correlation in the overall population, but just because Jane has a Ph.D. and Chris doesn’t even have a B.A., we can’t automatically assume that Jane is better at chess.

(D) is tricky. The “education level” variable suggests the idea of “length of time being educated”, but that’s not explicitly part of the variable. The question very clearly says one of the last three categories is “Master’s Degree”, so all master’s degrees would fall into this category, irrespective of the duration of the program. The correlation therefore tells us nothing about income differences within that single category.

(E) also plays on the correlation-causality fallacy. In general, folks who are more proficient at chess are more likely to pursue higher degrees, but that doesn’t mean that, step by step through each year of school, they are steadily learning more about chess. In other words, the education does not strictly “cause” the proficiency in chess.

### 25 Responses to GMAT Critical Reasoning: Populations

1. Subham May 17, 2016 at 1:25 am #

Hi Mike,

I have a query regarding ques 2,

A recent study has shown a strong correlation between highest education level attained and proficiency in chess.

Strong Correlation can either be positive or negative.
Am I correct?

Thanks

• Magoosh Test Prep Expert May 18, 2016 at 11:39 am #

Hi Subham,

You’re correct that a strong correlation can either be positive or negative. A positive correlation refers to when two variables increase together. Conversely, a negative correlation describes the situation in which one variable increases while the other decreases. When a set of data is plotted, a trend line with a positive slope indicates a positive correlation, while a trend line with a negative slope indicates a negative correlation. In either case, the closer the data points are to the trend line, the stronger the correlation is said to be.

Hope that helps 🙂

2. Prashant March 24, 2016 at 8:01 am #

Hi Mike,
Critical Reasoning gets really frustrating for me and i am struggling with it. This explanation about the difference between co-relation and causality was really helpful. I was able to solve the two questions by myself just after reading your explanation of co-relation and this was the first time i got a CR question correct through actual reasoning instead of guessing. Thanks for the detailed blog. Looking forward to more blogs like these.
Prashant

3. Jeremy July 17, 2015 at 6:18 pm #

Hi there, thanks for the great post!
I’m not sure about the second question though.
The answer doesn’t follow the correlation that is mentioned in the passage. Actually it doesn’t seem to reflect anything what’s been mentioned in the passage I think – the passage mentions two types of correlations and the answer provides an instance that goes against it. However the explanation here, not in the passage, reminds us to remember the ‘population view doesn’t necessarily apply to the individual view’.

So was it a necessary to have “Mistake #2: The Problem of Scope” knowledge in mind before solving the problem?

4. Adnan June 21, 2015 at 7:52 am #

Hi

I have a doubt regarding the first question:

Choice E states that of the people who contract Lupus majority are either Scandinavians or of Scandinavian descent which I guess is supported by the argument, which states that 97% of the people having the defect are Scandinavians.

If the option were Majority of the Scandinavians have Lupus then probably your reasoning would have been right and option E wrong.

But here E seems to be a legitimately correct answer to me.

Same for Option A too, where it compares people of Scandinavian and non -Scandinavian descent. It does not say that people of Denmark are at highest risk of contracting lupus as compared to other diseases.
So A. seems right too.

Thanks

5. Amit October 10, 2014 at 11:03 pm #

Hi Mike ,

I have a query regarding the second question in this blog.
Had the option D been like this:

The average salary for people who have completed PhD is higher than the average salary of people who have completed Master’s Programs.

would that be correct ?

Regards,
Amit

• Mike October 12, 2014 at 3:28 pm #

Dear Amit,
That’s a great question, and I am happy to respond. 🙂 That statement would most likely be true, we would have very good reason to suspect that it is true, BUT technically, we don’t know for a fact that it is true. It could be that Masters & PhD program grads have about the same salary, which is much higher than the salaries of the undereducated folks: that would be enough to account for a correlation. The correlation implies a general discernible linear pattern in the population data over all, but it doesn’t necessarily hold from group to group.
BTW, that statement, while technically not guaranteed to be true, would be far too close to true to be a wrong answer on the GMAT. The statement you proposed gets into the kind of hyper-technical distinctions one might study in advanced statistics, but the GMAT does not expect you to have that kind of knowledge.
Does all this make sense?
Mike 🙂

6. mahamamd January 2, 2014 at 9:12 pm #

Dear Mike, u remember the reasoning below: it’s from one of the practice test: inference question. I have no contention with the ans choice (A) except for one: the text specifies–vaccinating all of the citizens of this state for Tacitus’s disease, but answer says young children of the state will be at risk of…… I guess so far it should be–citizens will be at risk…..young children instead.

Public Health Official: After several years of vaccinating all of the citizens of this state for Tacitus’ Disease, a highly infectious virus, state hospitals have cut costs by no longer administering this vaccine, starting at the beginning of this year. A state senator defended the position, arguing that after several years with zero incidence of the disease in the state, its citizens were no longer at risk. This is a flawed argument. Our state imports meats and produce from countries with high incidences of diseases for which our country has vaccines. Three years ago, when we reduced the use of the Salicetiococcus vaccines, a small outbreak of Salicetiococcus among young children, fortunately without fatalities, encouraged us to resume use of the previous vaccines.

The public health official’s statements, if true, best support which of the following as a conclusion?

(A)Young children of the state will be at risk for Tacitus’ Disease.

If u write some explanation for that, I will really appreciate that.

• Mike January 3, 2014 at 7:33 am #

Dear Mahamamd,
That question is NOT from one of the practice tests. It is a Magoosh question that I wrote for this blog. You can find the full explanation here:
https://magoosh.com/gmat/2013/gmat-critical-reasoning-find-the-conclusion-or-inference/
Let me know there if you have any further questions about it.
Mike 🙂

• mahamamd January 3, 2014 at 3:50 pm #

Thank u a lot, Mike. Actually I mean that question is from critical reasoning practice session. My question is still the same.

I have no contention with the ans choice (A) except for one: the text specifies–vaccinating all of the citizens of this state for Tacitus’s disease, but answer choice (A) says young children of the state will be at risk of…… I guess so far it should be–citizens will be at risk…..(not young children), because “outbreak of Salicetiococcus in young children” is used as an analogy.

• Mike January 3, 2014 at 5:04 pm #

Dear Mahamamd,
If the vaccine had been administered for years, until recently, then everyone who was born more than a couple years ago would have received the vaccine and would therefore be immune to the Tacitus’s disease. The only folks who would be vulnerable would be the folks born since authorities discontinued the vaccine — that would only be young children born in the past year or so. That’s why (A) concerns only young children, not citizens of other ages.
Mike 🙂

7. Mahammad December 23, 2013 at 6:20 pm #

Hello Mike,

It’s a great post.

Can You just make sure the ans of question # 1 is D or E. I guess the ans is E.
Your explanation also says it’s E. I guess the answer choice D is not correct because the prompt does not say anything about non-scandinavian people. Since it’s a inferred question can we inferred something that is not stated in the prompt?

• Mike December 23, 2013 at 7:06 pm #

As I discussed above, for question #1, choice (E) is one of the common fallacies, one of the common traps for this type of question. It is definitely not correct.
The correct answer is (D), as stated in the solution. Don’t be so literalist about this idea of “not stated in the prompt”. The prompt says: “People who carry the Colfax Beta-80 defect are at substantially higher risk for contracting Lupus and related autoimmune diseases.” In other words, “People with the defect are more likely to get Lupus than people without the defect.” That’s clearly a valid inference. All choice (D) does is replace the general word “people” with particular groups of people. This is a detail change, not a logical change. Logically, the sentence still says the same thing, and is still a valid inference, whether we talk about “people” in general or about particular groups of people. The GMAT loves to make detail changes that don’t change the logic, because people get attached to the detail. If you remain too literalist with your “not stated in the prompt” rule, the GMAT will trick you time and time again simply by re-arranging the details without changing the logic.
Does this make sense?
Mike 🙂

• mahamamd December 24, 2013 at 4:40 pm #

this explanation is quite obvious; I really appreciate that.

• Mike December 24, 2013 at 5:16 pm #

Dear Mahamamd,
Mike 🙂

8. Ankit October 10, 2013 at 10:51 pm #

Mike,

Just found your posts from somewhere and I am really glad that I found them because they are like life saver for me. Keep doing up the good work and really thank you very much.

• Mike October 11, 2013 at 9:52 am #

Dear Ankit,
Thank you for your kind words. I’m glad you found this helpful. Best of luck to you!
Mike 🙂

9. trois_couleurs July 29, 2013 at 3:18 am #

I’m ready to write “thank you” for each article 😀

• Mike July 29, 2013 at 9:40 am #

Dear “Tri-Colors”,
I appreciate your gratitude. Best of luck, my friend.
Mike 🙂

10. Shailendra June 7, 2013 at 5:38 am #

A good way to say —

Correlation implies possibility and is not certainty.

Any answer choice that talks about possibility (keywords such as: probable, might, can) is more likely to be correct than answer choices talks certainty (keywords such as: will, are).

Be cautious – I just made another correlation above 😉

• Mike June 7, 2013 at 9:57 am #

Dear Shailendra,
Great point — “possible” statements are far more likely to be true than statements of certainty. BTW, your *correlation* reminds me of
http://xkcd.com/552/
Mike 🙂

• Shailendra June 7, 2013 at 10:30 am #

>> BTW, your *correlation* reminds me of http://xkcd.com/552/

Not “may be” but certainly – your blog has helped with full certainty.

• Mike June 7, 2013 at 12:29 pm #

Why, thank you very much for your kind words.
Mike 🙂

11. Vishnu Suresh June 5, 2013 at 11:11 am #

Mike,

This was a very interesting article indeed! For someone beginning to brush off years of rusty-unused math/verbal skills, these articles are a godsend. Thank you! I lookforward to many more articles :).

Regards,
Vishnu

• Mike June 5, 2013 at 11:21 am #

Dear Vishnu,
I’m very glad you found it helpful. Thank you for your kind words, and best of luck to you.
Mike 🙂
