GMAT Critical Reasoning: Populations

June 5, 2013

To begin, a couple GMAT CR questions, variations on a theme.

1) Colfax Beta-80 is a rare genetic defect found primarily in people of Scandinavian descent. Over 97% of known carriers of this defect are citizens of, or are direct descendants of immigrants from, Denmark, Norway, and Sweden. People who carry the Colfax Beta-80 defect are at substantially higher risk for contracting Lupus and related autoimmune diseases.

Assuming the statements above are true, which of the following can be inferred from them?

de facto

(E) The majority of people who contract Lupus are either Scandinavian or of Scandinavian descent.

2) In social science research, “highest education level attained” would refer to the most advanced grade or degree achieved by an individual — for some individuals, it may be a grade in grade school, and for other individuals, it may be a Bachelor’s Degree, a Master’s Degree, or Ph.D. (which is considered the highest education level). A recent study has shown a strong correlation between highest education level attained and proficiency in chess. Another result, studied at many points throughout the 20th century, shows a marked positive correlation between highest education level attained and income level.

Assuming the statements above are true, what conclusion can be drawn from them?

Reasoning with populations

Folks who have not studied statistics tend to fall into some predictable mistakes when thinking about issues involving correlation. After all, correlation is a very sophisticated idea. Most educated people have heard the word and have a vague idea of what it means (e.g. if A goes up, B goes up), but this vague popular understanding lends itself to some obvious misunderstandings that the GMAT loves to exploit.

Mistake #1: Correlation & Causality

Any veteran of a statistics class has probably heard the mantra: correlation does not imply causality. This is tricky, because of course, the converse is true: causality does, in fact, imply correlation. If A reliably causes B, then whenever you find A, you will be likely to find B. For example, smoking causes a large collection of undesirable conditions, including lung cancer, emphysema, and heart disease, and sure enough, it is highly correlated with each of these.

The catch, though, is that A & B can be correlated even when A does not cause B. For example, A & B would be highly correlated if they were both responses to the same underlying cause: beer sales and ice cream sales are highly correlated, not because folks like having beer a la mode, but because another cause, hot weather, drives both. There are other, more complicated relationships we will not explore here in which A & B would tend to show up together (that is, they would be correlated) without either one causing the other.
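The common-cause situation is easy to reproduce in a small simulation. The sketch below is only an illustration: the temperature range, the sales formulas, and the noise levels are all invented, and the only point is that two quantities that never influence each other can still be strongly correlated when a third factor drives both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures (degrees C): the common cause.
temps = [rng.uniform(5, 35) for _ in range(1000)]

# Each sales figure depends on temperature plus its own independent noise.
# Neither formula mentions the other product.
beer = [10 * t + rng.gauss(0, 40) for t in temps]
ice_cream = [8 * t + rng.gauss(0, 40) for t in temps]

r_all = pearson(beer, ice_cream)

# Hold the cause roughly fixed: keep only days between 19 and 21 degrees.
mild = [(b, i) for t, b, i in zip(temps, beer, ice_cream) if 19 <= t <= 21]
r_mild = pearson([b for b, _ in mild], [i for _, i in mild])

print(f"all days:  r = {r_all:.2f}")
print(f"mild days: r = {r_mild:.2f} (n = {len(mild)})")
```

Across all days the two sales series should track each other closely, while on days with nearly the same temperature the relationship should mostly disappear. That is the signature of a shared cause rather than one product driving the other.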

Another way to say this is: correlation is relatively easy to demonstrate. All you need is broad sociological or epidemiological data and some statistical software. By contrast, demonstrating causality is often a major scientific achievement, sometimes worthy of a Nobel Prize. To demonstrate that A causes B, one would need to show that dozens and dozens of conditions are met, only the most elementary of which is that A is correlated with B.

In any question about correlation, the GMAT loves incorrect answers that blur the distinction between correlation and causality.

Mistake #2: The Problem of Scope

A correlation is something that exists across a whole population. In the natural sciences, and especially in the physical sciences, one can get extremely tight correlations, such that all the data points lie exclusively on a straight line. In that case, the correlation is true not only at the population level, but also at the level of individual points: if one point is higher in A, that point must be higher in B.

When the GMAT talks about correlation, mostly this will not be in the context of the natural sciences. Instead, it will be in the context of the social sciences. Human populations are messy. There’s always a ton of random fluctuation involved in anything you measure about people, and this makes the social sciences considerably less precise than the natural sciences. A correlation in the social sciences is something that’s true in a population-wide view, but when the scope shifts to individual-to-individual comparisons, the statistical noise is too great to discern any reliable pattern.

For example, one well-measured social science study demonstrated the correlation of income and height. If one steps back and looks at the whole population, one can discern a mild relationship: on average, taller people are slightly more likely to have higher salaries than are shorter people. At the level of whole populations, at the level of probabilities, this relationship holds. Now, switch to the individual level. It’s sheer nonsense to say that, if Alex is taller than Bert, then Alex must be richer than Bert. It’s trivially easy to find single examples of poor tall people and rich short people. The correlation is something that is true in the population view, but at the level of individuals, it’s virtually meaningless, except as a very weak probability statement. Folks not familiar with statistics forget this, and get almost “fundamentalist” in their interpretation of correlation, as if the fact that A is correlated with B means that in every single instance that A goes up, it absolutely must be true that B goes up. The GMAT loves to prey on this kind of misconception.
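A quick simulation shows how little a mild population-level correlation says about any two individuals. This is a sketch under invented assumptions: heights and incomes are standardized scores, and the 0.2 weight linking them is made up for illustration rather than taken from any study.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
population = 20_000

# Hypothetical standardized heights, and incomes that lean only
# slightly on height; most of the variation is unrelated noise.
heights = [rng.gauss(0, 1) for _ in range(population)]
incomes = [0.2 * h + rng.gauss(0, 1) for h in heights]

r = pearson(heights, incomes)

# Individual level: draw random pairs and ask whether the taller
# person of the pair is also the richer one.
trials = 20_000
taller_is_richer = 0
for _ in range(trials):
    a, b = rng.sample(range(population), 2)
    if heights[a] < heights[b]:
        a, b = b, a  # make `a` the taller of the two
    if incomes[a] > incomes[b]:
        taller_is_richer += 1

share = taller_is_richer / trials

print(f"population-level correlation: r = {r:.2f}")
print(f"pairs where the taller person is richer: {share:.1%}")
```

With these made-up numbers the population-level correlation is real but small, and the taller person should turn out richer in only a little more than half of the pairs. The "Alex must be richer than Bert" claim fails almost as often as it holds.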

Summary

If reading this post gave you any insights into the nature of correlation, you might give those questions at the top another look before reading the explanations below. If you would like to share any insights or ask a question, let us know in the comment section at the bottom!

Solutions to the Practice Questions

1) The prompt tells us that Colfax Beta-80 is a genetic defect. Most of the folks who have this defect are Scandinavian, but we don’t know what percent of Scandinavians have this defect. It may be a substantial portion, but that’s unlikely because the defect is “rare.” Much more likely: there may be only a couple hundred people in the whole world who have this defect, and 97% of this couple hundred are from Scandinavia. That is a large percentage among those with the defect, but not a large percentage among the Scandinavian population as a whole. A couple answer choices conflate these two percentages.
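To make the two percentages concrete, here is a back-of-the-envelope sketch. Only the 97% figure comes from the prompt. The carrier count follows the "couple hundred" guess above, and the population size is an assumed round number used purely for illustration.

```python
# Hypothetical figures; the prompt gives none of these except the 97%.
carriers_worldwide = 200                 # "a couple hundred" carriers
share_of_carriers_scandinavian = 0.97    # from the prompt (treated as exactly 97%)
scandinavian_population = 20_000_000     # assumed round number

scandinavian_carriers = carriers_worldwide * share_of_carriers_scandinavian

# The percentage the prompt does NOT give: carriers among Scandinavians.
share_of_scandinavians_carrying = scandinavian_carriers / scandinavian_population

print(f"Scandinavians among carriers: {share_of_carriers_scandinavian:.0%}")
print(f"Carriers among Scandinavians: {share_of_scandinavians_carrying:.4%}")
```

Under these assumptions, nearly every carrier is Scandinavian, yet only a tiny fraction of Scandinavians are carriers. The same two groups are involved, but the percentage depends entirely on which group is the denominator.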

Anyone with this defect is at higher risk for Lupus and other autoimmune diseases.

(D) is the credited answer. This more or less restates the information of the last sentence. Anyone with this defect (Scandinavian or not) has a substantially higher risk of Lupus compared to anyone without the defect (Scandinavian or not).

(A) & (E) play on the misunderstanding about what the 97% implies. Most folks with this genetic defect are Scandinavian, but that doesn’t imply that most Scandinavian people have this defect. People with the defect are at higher risk for Lupus, but that doesn’t mean large sections of the Scandinavian population are at risk for Lupus.

(B) is wrong because, while we are told this genetic defect causes susceptibility to Lupus, we don’t know what other factors might cause or contribute to Lupus. Just because we eliminate this one factor does not mean we would eliminate everything in the world that could possibly contribute to the onset of Lupus.

(C) is wrong because, while we are told this genetic defect raises the risk of Lupus, we are also told it raises the risk of other autoimmune diseases. Even if we had a cure for Lupus, these other autoimmune diseases would still pose health threats to carriers of the defect.

2) This question presents two correlations, education level with chess, and education level with income. We would do well to remember both errors mentioned above.

(B) is the credited answer. In the population view, higher education level is correlated, on average, with higher income, but this doesn’t apply at the individual level. Indeed, despite the overall population pattern, it would certainly be possible to find someone with a sixth-grade education who made a fortune and therefore was richer than many people with Ph.D.’s. It wouldn’t be likely, if we picked a random person with a sixth-grade education and a random person with a Ph.D., but it would be possible.

(A) plays on the correlation-causality fallacy. Chess is correlated with education level, but doesn’t “cause” education level. Education level is correlated with income, but doesn’t singlehandedly “cause” income. There is no reason to conclude what (A) says.

(C) plays on the fallacy of scope. Yes, there’s a correlation in the overall population, but just because Jane has a Ph.D. and Chris doesn’t even have a B.A., we can’t automatically assume that Jane is better at chess.

(D) is tricky. The “education level” variable implies the idea of “length of time being educated”, but that’s not explicitly part of the variable. The question very clearly says one of the last three categories is “Master’s Degree”, so all master’s degrees would fall into this category, irrespective of the duration of the program.

(E) also plays on the correlation-causality fallacy. In general, folks who are more proficient at chess are more likely to pursue higher degrees, but it’s not that, step by step through their years of schooling, they are steadily learning more about chess. In other words, the education does not strictly “cause” the proficiency in chess.

Author

Mike MᶜGarry

Mike served as a GMAT Expert at Magoosh, helping create hundreds of lesson videos and practice questions to help guide GMAT students to success. He was also featured as “member of the month” for over two years at GMAT Club. Mike holds an A.B. in Physics (graduating magna cum laude) and an M.T.S. in Religions of the World, both from Harvard. Beyond standardized testing, Mike has over 20 years of both private and public high school teaching experience specializing in math and physics. In his free time, Mike likes smashing foosballs into orbit, and despite having no obvious cranial deficiency, he insists on rooting for the NY Mets. Learn more about the GMAT through Mike’s Youtube video explanations and resources like What is a Good GMAT Score? and the GMAT Diagnostic Test.


Comments

25 responses to “GMAT Critical Reasoning: Populations”

Subham

May 17, 2016

Hi Mike,

I have a query regarding ques 2,

A recent study has shown a strong correlation between highest education level attained and proficiency in chess.

Strong Correlation can either be positive or negative.
Am I correct?

Thanks

1. Magoosh Test Prep Expert
  
  May 18, 2016
  
  Hi Subham,
  
  You’re correct that a strong correlation can either be positive or negative. A positive correlation refers to when two variables increase together. Conversely, a negative correlation describes the situation in which one variable increases while the other decreases. When a set of data is plotted, a trend line with a positive slope indicates a positive correlation, while a trend line with a negative slope indicates a negative correlation. In either case, the closer the data points are to the trend line, the stronger the correlation is said to be.
  
  Hope that helps 🙂
  
Prashant

March 24, 2016

Hi Mike,
Critical Reasoning gets really frustrating for me and i am struggling with it. This explanation about the difference between co-relation and causality was really helpful. I was able to solve the two questions by myself just after reading your explanation of co-relation and this was the first time i got a CR question correct through actual reasoning instead of guessing. Thanks for the detailed blog. Looking forward to more blogs like these.
Prashant

Jeremy

July 17, 2015

Hi there, thanks for the great post!
I’m not sure about the second question though.
The answer doesn’t follow the correlation that is mentioned in the passage. Actually it doesn’t seem to reflect anything what’s been mentioned in the passage I think – the passage mentions two types of correlations and the answer provides an instance that goes against it. However the explanation here, not in the passage, reminds us to remember the ‘population view doesn’t necessarily apply to the individual view’.

So was it a necessary to have “Mistake #2: The Problem of Scope” knowledge in mind before solving the problem?

Adnan

June 21, 2015

Hi

I have a doubt regarding the first question:

Choice E states that of the people who contract Lupus majority are either Scandinavians or of Scandinavian descent which I guess is supported by the argument, which states that 97% of the people having the defect are Scandinavians.

If the option were Majority of the Scandinavians have Lupus then probably your reasoning would have been right and option E wrong.

But here E seems to be a legitimately correct answer to me.

Same for Option A too, where it compares people of Scandinavian and non -Scandinavian descent. It does not say that people of Denmark are at highest risk of contracting lupus as compared to other diseases.
So A. seems right too.

Please provide your insights
Thanks

Amit

October 10, 2014

Hi Mike ,

I have a query regarding the second question in this blog.
Had the option D been like this:

The average salary for people who have completed PhD is higher than the average salary of people who have completed Master’s Programs.

would that be correct ?

Regards,
Amit

1. Mike
  
  October 12, 2014
  
  Dear Amit,
  That’s a great question, and I am happy to respond. 🙂 That statement would most likely be true, and we would have very good reason to suspect that it is true, BUT technically, we don’t know for a fact that it is true. It could be that Masters & PhD program grads have about the same salary, which is much higher than the salaries of the undereducated folks: that would be enough to account for a correlation. The correlation implies a general discernible linear pattern in the population data overall, but it doesn’t necessarily hold from group to group.
  BTW, that statement, while technically not guaranteed to be true, would be far too close to true to be a wrong answer on the GMAT. The statement you proposed gets into the kind of hyper-technical distinctions one might study in advanced statistics, but the GMAT does not expect you to have that kind of knowledge.
  Does all this make sense?
  Mike 🙂
  
mahamamd

January 2, 2014

Dear Mike, u remember the reasoning below: it’s from one of the practice test: inference question. I have no contention with the ans choice (A) except for one: the text specifies–vaccinating all of the citizens of this state for Tacitus’s disease, but answer says young children of the state will be at risk of…… I guess so far it should be–citizens will be at risk…..young children instead.

Public Health Official: After several years of vaccinating all of the citizens of this state for Tacitus’ Disease, a highly infectious virus, state hospitals have cut costs by no longer administering this vaccine, starting at the beginning of this year. A state senator defended the position, arguing that after several years with zero incidence of the disease in the state, its citizens were no longer at risk. This is a flawed argument. Our state imports meats and produce from countries with high incidences of diseases for which our country has vaccines. Three years ago, when we reduced the use of the Salicetiococcus vaccines, a small outbreak of Salicetiococcus among young children, fortunately without fatalities, encouraged us to resume use of the previous vaccines.

The public health official’s statements, if true, best support which of the following as a conclusion?

(A)Young children of the state will be at risk for Tacitus’ Disease.

If u write some explanation for that, I will really appreciate that.

1. Mike
  
  January 3, 2014
  
  Dear Mahamamd,
  That question is NOT from one of the practice tests. It is a Magoosh question that I wrote for this blog. You can find the full explanation here:
  https://magoosh.com/gmat/2013/gmat-critical-reasoning-find-the-conclusion-or-inference/
  Let me know there if you have any further questions about it.
  Mike 🙂
  
  1. mahamamd
    
    January 3, 2014
    
    Thank u a lot, Mike. Actually I mean that question is from critical reasoning practice session. My question is still the same.
    
    I have no contention with the ans choice (A) except for one: the text specifies–vaccinating all of the citizens of this state for Tacitus’s disease, but answer choice (A) says young children of the state will be at risk of…… I guess so far it should be–citizens will be at risk…..(not young children), because “outbreak of Salicetiococcus in young children” is used as an analogy.
    
    1. Mike
      
      January 3, 2014
      
      Dear Mahamamd,
      If the vaccine had been administered for years, until recently, then everyone who was born more than a couple years ago would have received the vaccine and would therefore be immune to the Tacitus’s disease. The only folks who would be vulnerable would be the folks born since authorities discontinued the vaccine — that would only be young children born in the past year or so. That’s why (A) concerns only young children, not citizens of other ages.
      Mike 🙂
      
Mahammad

December 23, 2013

Hello Mike,

It’s a great post.

Can You just make sure the ans of question # 1 is D or E. I guess the ans is E.
Your explanation also says it’s E. I guess the answer choice D is not correct because the prompt does not say anything about non-scandinavian people. Since it’s a inferred question can we inferred something that is not stated in the prompt?

thank u in advance.

1. Mike
  
  December 23, 2013
  
  Dear Mahammad,
  As I discussed above, for question #1, choice (E) is one of the common fallacies, one of the common traps for this type of question. It is definitely not correct.
  The correct answer is (D), as stated in the solution. Don’t be so literalist about this idea of “not stated in the prompt”. The prompt says: “People who carry the Colfax Beta-80 defect are at substantially higher risk for contracting Lupus and related autoimmune diseases.” In other words, “People with the defect are more likely to get Lupus than people without the defect.” That’s clearly a valid inference. All choice (D) does is replace the general word “people” with particular groups of people. This is a detail change, not a logical change. Logically, the sentence still says the same thing, and is still a valid inference, whether we talk about “people” in general or about particular groups of people. The GMAT loves to make detail changes that don’t change the logic, because people get attached to the detail. If you remain too literalist with your “not stated in the prompt” rule, the GMAT will trick you time and time again simply by re-arranging the details without changing the logic.
  Does this make sense?
  Mike 🙂
  
  1. mahamamd
    
    December 24, 2013
    
    this explanation is quite obvious; I really appreciate that.
    
    1. Mike
      
      December 24, 2013
      
      Dear Mahamamd,
      You’re welcome. I’m glad you found it helpful.
      Mike 🙂
      
Ankit

October 10, 2013

Mike,

Just found your posts from somewhere and I am really glad that I found them because they are like life saver for me. Keep doing up the good work and really thank you very much.

1. Mike
  
  October 11, 2013
  
  Dear Ankit,
  Thank you for your kind words. I’m glad you found this helpful. Best of luck to you!
  Mike 🙂
  
trois_couleurs

July 29, 2013

I’m ready to write “thank you” for each article 😀

1. Mike
  
  July 29, 2013
  
  Dear “Tri-Colors”,
  I appreciate your gratitude. Best of luck, my friend.
  Mike 🙂
  
Shailendra

June 7, 2013

A good way to say —

Correlation implies possibility and is not certainty.

Any answer choice that talks about possibility (keywords such as: probable, might, can) is more likely to be correct than answer choices talks certainty (keywords such as: will, are).

Be cautious – I just made another correlation above 😉

1. Mike
  
  June 7, 2013
  
  Dear Shailendra,
  Great point — “possible” statements are far more likely to be true than statements of certainty. BTW, your *correlation* reminds me of
  http://xkcd.com/552/
  Mike 🙂
  
  1. Shailendra
    
    June 7, 2013
    
    >> BTW, your *correlation* reminds me of http://xkcd.com/552/
    
    Not “may be” but certainly – your blog has helped with full certainty.
    
    1. Mike
      
      June 7, 2013
      
      Why, thank you very much for your kind words.
      Mike 🙂
      
Vishnu Suresh

June 5, 2013

Mike,

This was a very interesting article indeed! For someone beginning to brush off years of rusty-unused math/verbal skills, these articles are a godsend. Thank you! I lookforward to many more articles :).

Regards,
Vishnu

1. Mike
  
  June 5, 2013
  
  Dear Vishnu,
  I’m very glad you found it helpful. Thank you for your kind words, and best of luck to you.
  Mike 🙂
  