First, some practice questions. The scenario below is relevant to questions #1-3.
There are two sets of letters, and you are going to pick exactly one letter from each set.
Set #1 = {A, B, C, D, E}
Set #2 = {K, L, M, N, O, P}
1) What is the probability of picking a C and an M?

(A) 1/30
(B) 1/15
(C) 1/6
(D) 1/5
(E) 1/3
2) What is the probability of picking a C or an M?

(A) 1/30
(B) 1/15
(C) 1/6
(D) 1/5
(E) 1/3
3) What is the probability of picking two vowels?

(A) 1/30
(B) 1/15
(C) 1/6
(D) 1/5
(E) 1/3
_____________________________________________________________________
4) In a certain corporation, there are 300 male employees and 100 female employees. It is known that 20% of the male employees have advanced degrees and 40% of the females have advanced degrees. If one of the 400 employees is chosen at random, what is the probability this employee has an advanced degree and is female?

(A) 1/20
(B) 1/10
(C) 1/5
(D) 2/5
(E) 3/4
5) In a certain corporation, there are 300 male employees and 100 female employees. It is known that 20% of the male employees have advanced degrees and 40% of the females have advanced degrees. If one of the 400 employees is chosen at random, what is the probability this employee has an advanced degree or is female?

(A) 1/20
(B) 1/10
(C) 1/5
(D) 2/5
(E) 3/4
The simplistic probability rules
Here is the absolute bare minimum you need to know for probability calculations on the GMAT:
“AND” means MULTIPLY
“OR” means ADD
Is this the whole story? Well, not exactly. But if you can’t remember or don’t understand anything else about probability, at least know these two bare-bones rules, because just this will put you ahead of so many people. These two rules are enough to solve problems #1 and #3, although relying on them alone could lead to trouble on the others. Before we qualify these simplistic rules, we need to discuss two distinctions.
Disjoint
Two events are disjoint if they are mutually exclusive. In other words, two events are disjoint if the probability of their simultaneous occurrence is zero, that is, it is absolutely impossible to have them both happen at the same time. For example, different faces of a single die are disjoint: under ordinary circumstances, if you roll one die once, you can’t simultaneously get, say, both a 3 and a 5. Those two are disjoint. Suppose we are picking random people and classifying them by their current age. In this process, being in the category “teenager” and being in the category “senior citizen” are disjoint: there is no one we could pick who is simultaneously in both categories. Most categories involving human beings are too messy for the distinction “disjoint” to apply.
If events A and B are disjoint, then we can use the simplified OR rule:
P(A or B) = P(A) + P(B)
That’s the case in which the simplified rule, OR means ADD, works perfectly. If events A and B are not disjoint, then we have to use the generalized OR rule:
P(A or B) = P(A) + P(B) – P(A and B)
The reason for that final term: we need to subtract the overlap. The events in the region “A and B” are included in region A and also in region B, so if we add those two regions, the overlap gets counted twice. We need to subtract it, so it is only counted once like everything else.
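To make the double-counting concrete, here is a minimal sketch in Python that checks both OR rules by brute-force counting on a single fair die. The overlapping events ("even" and "above three") and the helper name `prob` are illustrative choices of mine, not part of the rules themselves.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes), all outcomes equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

# Disjoint events: rolling a 3 and rolling a 5 cannot both happen on one roll.
three, five = {3}, {5}
assert three & five == set()
p_disjoint = prob(three | five)
assert p_disjoint == prob(three) + prob(five)  # simplified OR rule holds

# Overlapping events (hypothetical examples): 4 and 6 are in both.
even = {2, 4, 6}
above_three = {4, 5, 6}
p_overlap = prob(even | above_three)
# Generalized OR rule: subtract the overlap so it is counted only once.
assert p_overlap == prob(even) + prob(above_three) - prob(even & above_three)

# Simply adding would overshoot, because 4 and 6 get counted twice.
naive_sum = prob(even) + prob(above_three)
print(p_disjoint, p_overlap, naive_sum)  # 1/3 2/3 1
```

Notice that the naive sum comes out to 1, which would claim the event is certain, even though a roll of 1 or 3 satisfies neither condition.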
Independent
Two events are independent if whether one happens has absolutely no influence on whether the other happens. In other words, knowing about the outcome of one event gives absolutely no information about how the other event will turn out. For example, if I roll two ordinary dice, the outcome of each die is independent of the other die. If I tell you I rolled two dice, and the first die was a 4, then knowing that gives you no clue about what the number on the other die might be. On any given day, what the weather is in the San Francisco Bay Area and how the Dow Jones Industrial Average performs are independent: knowing one gives us absolutely no information about how the other turned out.
We have to be careful. If I shuffle a deck of cards, draw one, replace it, reshuffle, and draw another, then the two cards are independent. BUT, if I shuffle the deck, draw one card, and then without replacement draw a second card, then they are not independent. For example, if the first card is the 7 of Hearts, then it is less likely that the second card would be either a 7 or a Heart, because there are fewer of those options among the remaining 51 cards.
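The card example can be checked by counting. The sketch below builds a standard 52-card deck and compares the chance that the second card is "a 7 or a Heart" with replacement versus without replacement, assuming the first card drawn was the 7 of Hearts. The variable and function names are my own, chosen just for illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

def is_seven_or_heart(card):
    rank, suit = card
    return rank == "7" or suit == "Hearts"

first = ("7", "Hearts")

# With replacement: the second draw comes from the full, reshuffled deck.
p_with = Fraction(sum(is_seven_or_heart(c) for c in deck), len(deck))

# Without replacement: the 7 of Hearts is gone, leaving 51 cards.
remaining = [c for c in deck if c != first]
p_without = Fraction(sum(is_seven_or_heart(c) for c in remaining), len(remaining))

print(p_with, p_without)  # 4/13 5/17
```

Since 5/17 is smaller than 4/13, the first draw really does change the odds for the second draw, which is exactly what "not independent" means.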
Also, notice that there are many human situations which would be independent in a perfect, just, ideal world, but regrettably are not independent in a real world full of inequities. In a perfect world, gender and corporate promotion would be independent, but in practice, they are not. In a perfect world, race and criminal conviction would be independent, but in practice, they are not.
If events A and B are independent, then we can use the simplified AND rule:
P(A and B) = P(A)*P(B)
That’s the case in which the simplified rule, AND means MULTIPLY, works perfectly. If events A and B are not independent, then things get complicated. Technically, the “generalized AND rule” formula would involve a concept known as “conditional probability”, which would lead into realms of probability theory that are tested less frequently on the GMAT. See that other blog that discusses conditional probability if you want to understand this advanced topic in more detail.
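Here is a rough Python sketch of when multiplying works and when it doesn't, using all 36 outcomes of two fair dice. The second pair of events ("first die is 6" and "sum is at least 10") is a hypothetical example I picked because the two events clearly influence each other; it is not from the discussion above.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first, second) outcomes of two fair dice.
rolls = list(product(range(1, 7), repeat=2))

def prob(predicate):
    """Fraction of the 36 outcomes that satisfy the predicate."""
    return Fraction(sum(1 for r in rolls if predicate(r)), len(rolls))

# Independent events: each concerns a different die.
first_is_4 = lambda r: r[0] == 4
second_is_6 = lambda r: r[1] == 6
p_joint_indep = prob(lambda r: first_is_4(r) and second_is_6(r))
assert p_joint_indep == prob(first_is_4) * prob(second_is_6)  # AND means MULTIPLY

# Not independent: a 6 on the first die makes a large sum more likely.
first_is_6 = lambda r: r[0] == 6
sum_at_least_10 = lambda r: r[0] + r[1] >= 10
p_joint_dep = prob(lambda r: first_is_6(r) and sum_at_least_10(r))
p_product = prob(first_is_6) * prob(sum_at_least_10)

print(p_joint_indep, p_joint_dep, p_product)  # 1/36 1/12 1/36
```

In the second case the true probability (1/12) is three times what multiplying would give (1/36), so the simplified AND rule would be badly wrong there.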
Practice
Having read this post, take another look at those practice questions, and see if you understand them better, before simply reading the explanations below. Be patient with yourself as you work through probability: it takes time to internalize these distinctions, such as “disjoint” and “independent”. There will be more information in the next post in this sequence.
Practice problem solutions
1) Whatever we pick from the first set is independent of whatever we pick from the second set, so we can use the simplified AND rule.
P(first pick = C) = 1/5
P(second pick = M) = 1/6
P(C and M) = P(C)*P(M) = (1/5)*(1/6) = 1/30
Answer = A
2) Picking an M is not disjoint with picking a C — they both could happen on the same round of the game. We have to use the generalized OR rule for this:
P(C or M) = P(C) + P(M) – P(C and M)
Fortunately, we know the first two, and we calculated the value of the third term already in #1.
P(C or M) = (1/5) + (1/6) – (1/30) = 6/30 + 5/30 – 1/30 = 10/30 = 1/3
Answer = E
3) On the first pick, two of the five letters are vowels (A & E), so the probability of picking a vowel on the first pick is 2/5. On the second pick, only one letter out of the six is a vowel (O), so the probability of picking a vowel on the second pick is 1/6. The two picks are independent: what one selects from one set has absolutely no bearing on what one picks from the other set. Therefore, we can use the simplified AND rule.
P(two vowels) = P(vowel on first pick)*P(vowel on second pick) =(2/5)*(1/6) = 2/30 = 1/15
Answer = B
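If you want to double-check the answers to #1-3 without any formulas, one option is simply to list all 30 possible pairs and count. The following is a minimal sketch in Python; the helper `prob` and the variable names are my own choices.

```python
from fractions import Fraction
from itertools import product

set1 = "ABCDE"
set2 = "KLMNOP"
pairs = list(product(set1, set2))  # 5 * 6 = 30 equally likely pairs
vowels = set("AEIOU")

def prob(predicate):
    """Fraction of the 30 pairs that satisfy the predicate."""
    return Fraction(sum(1 for p in pairs if predicate(p)), len(pairs))

p_c_and_m = prob(lambda p: p[0] == "C" and p[1] == "M")             # question 1
p_c_or_m = prob(lambda p: p[0] == "C" or p[1] == "M")               # question 2 (inclusive OR)
p_two_vowels = prob(lambda p: p[0] in vowels and p[1] in vowels)    # question 3

print(p_c_and_m, p_c_or_m, p_two_vowels)  # 1/30 1/3 1/15
```

Counting agrees with the rules: 1 pair out of 30 for #1, 10 pairs out of 30 for #2, and 2 pairs out of 30 for #3.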
4) Here we have an AND question, and the parameters (gender and advanced degree) are not independent. If I tell you the gender of a certain employee, then that gives you information about how likely it is that this employee has an advanced degree. One parameter gives information about the other, which means they are not independent. Therefore, we cannot use the simplified AND rule. Fortunately, it is relatively easy here to calculate everything directly.
There are 100 female employees, and we know 40% of them have advanced degrees, so there are 40 employees who both are female and have an advanced degree. That’s the number of employees in the AND region. Well, there are 400 employees altogether. Of these 400 total employees, the probability of picking someone in this AND region is
P = 40/400 = 1/10
Answer = B
5) In this corporation, there are 400 total employees. There are 100 women. Of the 300 men, 20% have advanced degrees: 10% of 300 must be 30, so 20% of 300 must be 60. Add the women and the men with advanced degrees: 100 + 60 = 160. This is the OR region, the full set of individuals that satisfy the condition “has an advanced degree or is female.” Of the 400 employees, what’s the probability of picking one of the 160 in this particular group?
P = 160/400 = 16/40 = 4/10 = 2/5
Answer = D
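For #4 and #5, it can help to lay out the four groups of employees explicitly and count. This Python sketch does that, using only the numbers given in the problem; the dictionary layout and the helper `prob` are my own illustrative choices. It also shows that the generalized OR rule gives the same answer as direct counting, and that naively multiplying P(female) by P(advanced degree) does not give the AND probability here.

```python
from fractions import Fraction

# Head counts by (gender, has_advanced_degree), from the problem statement.
counts = {
    ("male", True): 60,     # 20% of 300 men
    ("male", False): 240,
    ("female", True): 40,   # 40% of 100 women
    ("female", False): 60,
}
total = sum(counts.values())  # 400 employees

def prob(predicate):
    """Probability that a randomly chosen employee satisfies the predicate."""
    matching = sum(n for (gender, degree), n in counts.items() if predicate(gender, degree))
    return Fraction(matching, total)

p_female = prob(lambda g, d: g == "female")
p_degree = prob(lambda g, d: d)
p_and = prob(lambda g, d: g == "female" and d)   # question 4
p_or = prob(lambda g, d: g == "female" or d)     # question 5

# The generalized OR rule agrees with direct counting.
assert p_or == p_female + p_degree - p_and

# Multiplying is NOT valid here, because gender and degree are not independent.
p_naive = p_female * p_degree

print(p_and, p_or, p_naive)  # 1/10 2/5 1/16
```

The naive product, 1/16, differs from the true AND probability, 1/10, which is the numerical signature of two categories that are not independent.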
Try out some more GMAT probability problems here.
Hi Mike, I think the solution to #5 would be easier like this:
probability of people with degrees 100/400, plus people who are women 100/400, minus people who have a degree and are also women 40/400
Mike, by all means, but explaining a formula first and then calculate all examples in another way, is just confusing and also the reason for the many responses here and it also confuses me, although I understand your approach, but that makes the formula unnecessary. Still, on test day, one has a lot of stress, therefore a formula also gives some confidence.
However, what do you finally put in the OR formula in the question #5?
My approach was P(1/4) for all females + P(1/4) for all with advanced degrees – P (1/4*1/4) for female and advanced degree
but that does not work
Dear Wili: I’m happy to respond. First of all, understand that relying on formulas is a spectacularly bad way to approach the GMAT Quant section: the GMAT regularly writes Quant questions to punish students who rely on formulas. See:
https://magoosh.com/gmat/2014/gmatmaththeusesandabusesofformulas/
Now, this is true in all math, but it’s doubly true with respect to Probability. The hardest thing to appreciate about Probability is the set of subtle distinctions that require one solution or another. It’s about pattern matching and the right-brain skill of “how to see” the problem, how to interpret it correctly. Students focused on the left-brain perspective of “what to do” routinely miss the forest for the trees when it comes to Probability.
Now, the word “OR” appears in #5, so why didn’t I use the “OR” formula? The “OR” formula contains a term for P(A and B), and if we don’t know that, we cannot use the OR formula. You made the egregious mistake of assuming you could multiply for the AND case: you can only multiply if the two categories are independent. Clearly, educational level and gender are NOT independent, because the problem tells us explicitly that the percents of men & women with advanced degrees are different. Right away, that tells us the variables of gender and educational level are NOT independent, which means we can’t multiply to find the AND case, which means we have no way of applying the OR rule. That’s precisely why I didn’t use the OR rule in this problem, because we have absolutely no way to use it.
You have no business using the OR formula until you understand the idea of “disjoint.” You have no business using the AND rule until you understand the idea of “independent.” The formulas detached from those ideas are 100% useless.
Does all this make sense?
Mike
Dear Mike,
Thank you for your work! Everything looks clear to me except for one thing: why are educational level and gender not independent categories?
If I look at this question objectively, with no consideration of real-world problems (I mean, there are still countries in the world, unfortunately, where parents care more about providing a better education to their sons than to their daughters), these categories seem independent to me. So, should we consider real-world problems when approaching probability problems like that?
And can we say that gender and any other category (percentage of unemployment, number of writers, milk buyers, number of those who ordered spaghetti in a certain restaurant, number of lottery winners, etc.) are always NOT INDEPENDENT?
And what about nationalities? Are those categories, nationality and number of pet owners, also not independent? Religion and favorite football team? Then everything related to human beings is not independent?
That becomes more of a philosophical question than a math question.
Thank you in advance.
Hi Mike,
Regarding Q3, it asks what are the odds of getting 2 vowels…
The answer basically makes it 2 from the first pick and one from the second, which end up 3 vowels.
Should it be divided into 2 options?
1: pick 2 vowels from first group and 0 from second
2: pick 1 vowel from first group and 1 from second
What’s missing in my reasoning?
Dear Ron,
I’m happy to respond. Notice that there is relevant text above the problem: up at the top, it says, “There are two sets of letters, and you are going to pick exactly one letter from each set.” Thus, we can only pick one of those two vowels in the first set.
Does this make sense?
Mike
Mike,
For question 5, why aren’t we adding the total number of adv degrees to the number of females with adv degrees to get 140/400, as opposed to adding the number of men with adv degrees?
Dear Herpal,
I’m happy to help.
The question asks for the probability that a randomly chosen employee “has an advanced degree OR is female.” In other words, we are going to consider it a “yes” if the person we pick is any woman, advanced degree or not, and it’s also a “yes” if the person is a male with an advanced degree. How do we count all those people? Well, the 100 women include all the women — those with advanced degrees and those without advanced degrees. It doesn’t make sense to add the women with advanced degrees to all women, because then those women would be counted twice. Instead, once we have all women, the only people who count as a “yes” who are not included are the men with advanced degrees. That gives us the correct value of 160.
Alternately, we could figure out 80% of 300 = 240, the men who don’t have advanced degrees. This is the only group not included in the “yes” group, so we could subtract them from the whole — if 240 are in the “no” group, then 400 – 240 = 160 are in the “yes” group.
Does all this make sense?
Mike
Q2: What is the probability of picking a C or an M?
It’s not very clear from the above question if it asks to include probability of p(c) and p(m).
Apparently, it is not since you see p(m and c) getting subtracted in the answer but I think that’s because p(m and c) reflects twice & that’s why needs to be subtracted once.
As I understand from the answer to the question 2, I think, suggests that question means to get the probability “a C or an M or both”.
I say that because following are the two ways I found that out:
1) p(c from set 1).p( any from set2) + p(any from set 1).p(m from set2) – p(c from set1).p(m from set 2) –> This is because p(c).p(m) are included twice, so, subtract once
= (1/5)(6/6) + (1/6)(5/5) – (1/6)(1/5) = 10/30 = 1/3
2) p(c from set 1).p(any other than m from set 2) + p(any other than c from set 1). p(m from set 2) + p(c from set 1). p(m from set 2)
= (1/5)(5/6) + (1/6)(4/5) + (1/6)(1/5)
= (5+4+1)/30 = 10/30 = 1/3
Please suggest if I’m wrong here. Thank you very much in advance.
Dear Divine Acclivity,
On the GMAT and in math in general, the word “OR” is NOT understood as an “exclusive OR,” which is often denoted XOR in, say, computer science. In general, “A or B” means just A, just B, or both A & B together. By contrast, “A XOR B” means just A or just B, but NOT both of them together. Obviously, the GMAT would not use XOR notation: if they wanted to indicate this, they would have to spell out explicitly: just A by itself, or B by itself, but not both of them. On the GMAT, it is ALWAYS a mistake to look at a plain ordinary OR and interpret it as an XOR.
Does all this make sense?
Mike
Got it. Thank you dear Mike.
Dear Divine Acclivity,
You are quite welcome, my friend. Best of luck to you!
Mike
Mike in Question 4,
should it not be just 40/100
Total no of female students is 100 and the number with advanced degrees is 40.
If not are the 2 questions different?
1. What is the probability that a student is female and has an advanced degree?
2. What is the probability that one of the students with advanced degree is a female?
Thanks
Dear RD,
Question #4 asks: “If one of the 400 employees is chosen at random, what is the probability this employee has an advanced degree and is female?” You are correct that the 40 females with advanced degrees are the “desired” result, the numerator of the probability fraction, but the choice is not made from all female employees (100); instead, it is made from all 400 employees, so that’s the proper denominator. 40/400 = 1/10. In probability questions, one always has to read carefully, to make sure that one is both choosing the right group and choosing from the right group.
Probability questions have to be phrased extremely carefully. Your second question is an intriguing question: of all the people (males & females) with advanced degrees, how many are females? That’s a question that wasn’t asked here. 60 males and 40 females have advanced degrees, so of the 100 employees with advanced degrees, 40% or 4/10 are females.
Your first question is poorly worded. It seems to mean the same as question #4 here, but that wording would need to be cleaned up to be a GMAT worthy question. It’s not an easy thing to write a perfectly clear and unambiguous GMAT probability question.
Mike
P (a) + P (b) – P (a And b)
1/4 + 1/4 – 1/10 = 2/5
Why is the answer not written as it was just taught above?
Dude,
As is often the case with probability, there is more than one way to think about and solve a problem. Once we calculate P(A) and P(B), the formula you suggest becomes another valid solution. It’s good to see more than one way to solve a problem.
Mike
Why isn’t the probability of A (100/400) AND the probability of B (100/400) equal to 1/16? Where do you get the 1/10 figure from?
The formula I used was:
P(A or B) = P(1/4 + 1/4) – P(1/4 * 1/4), but that doesn’t work at all. I’m having a hard time translating these words into the probability formula…
Dear Katie,
I’m happy to respond. First of all, VERY IMPORTANT: unlike other areas of math, probability is distinctly NOT formula-based. You need to know the formulas, but knowing all the formulas is only about 10% of understanding probability. Probability is much more right-brained: it’s less about step-by-step recipe approaches, and more about pattern-matching, less about focusing on “what to do,” and more about “how to see,” about the perceptual choices and the ways of framing the information. With probability, you have to study the explanations carefully, again looking not simply for what to do, but for how the author framed the information, about the perceptual choices which initiated the solution.
On a more practical level, you are making a mistake in your use of the notation, which may be contributing to your confusion. Probability notation is akin to function notation — the event, A or B, is the “input” and the probability is the output. Thus, we might write P(A) = 1/4, but it is wholly incorrect to write P(1/4), to put fractions inside the probability parentheses. That confuses input and output.
The probability formulas are tricky, in part, because there are very specific conditions under which we can use them. For example, the P(A and B) = P(A)*P(B) formula is NOT a general-use, all-applicable formula. It is very specific: it only is valid if the two events are independent, and in problems #4-5, gender and advanced education are NOT independent. If they were independent, the same percent of women would have advanced degrees as for the men, and we know that is not true. Not independent, which means use of the P(A and B) = P(A)*P(B) formula is 100% wrong and forbidden. Everything is contextual in probability, and these framing ideas, such as “independent” or “disjoint”, are more important to understand than the formulas themselves.
In #4, we have to figure out the sizes of the individual groups, as I did in the explanations to that problem. In probability, I urge you to read the problem’s official explanation very carefully.
Does all this make sense?
Mike
Thank you very much for taking the time for such a detailed response. I’ll keep at it…
Dear Katie,
You’re quite welcome. Best of luck to you!
Mike
Hi Mike,
I’m confused with question #2. Why are the two sets not mutually exclusive, thus making the probability of C OR M (11/30). Why do we need to subtract the P (C and M)?
Thanks!
Heather
Heather,
That’s a fantastic question. In the process of picking two letters, we wind up with a set of two, a pair. In that pair, it’s possible for C and M to happen at the same time — the result of “get C as a member of the pair” is not mutually exclusive with the result “get M as a member of the pair.” The probability question in #2 is about the pair that results, so we need to consider the pair as a whole, not the individual selections in isolation.
Does this make sense?
Mike
Hi,
I’m confused about the Independent section.
In the previous section, you said “If events A and B are not disjoint, then we have to use the generalized OR rule: P(A or B) = P(A) + P(B) – P(A and B)”
And then you said “If events A and B are not disjoint, then things get complicated.”
So does “complicated” mean using the generalized OR rule?
Also, probability, permutation, combination and counting really confuse me. Do you have any advice in which order of your blogs I should read?
Jen,
I’m very sorry — that was a very confusing typo on my part. What I meant to say is: “If events A and B are not INDEPENDENT, then things get complicated.” I made the correction above, and added a link to a newer blog that discusses these “complications” in a little more detail.
Once again, I’m very sorry for the confusion. Yes, if two events are not DISJOINT, then no problem: we can just use the generalized OR rule. If two events are not INDEPENDENT, that’s where things get a little trickier.
Thank you for pointing this out.
For more on counting & permutations & combinations, see:
http://magoosh.com/gmat/2012/gmatpermutationsandcombinations/
http://magoosh.com/gmat/2012/gmatquanthowtocount/
http://magoosh.com/gmat/2012/gmatmathcalculatingcombinations/
http://magoosh.com/gmat/2013/difficultgmatcountingproblems/
That’s at least a start. Let me know if you have any further questions.
Mike
Thank you so much Mike! Now it makes sense and hopefully, “not independent” question will not appear in my test on Jan 25th.
Thanks again Mike!
Jen.
Jen,
You are quite welcome. Best of luck to you on the 25th, my friend!
Mike
Dear Mike,
In the problem # 5, in the explanation section, there are 100 employees with an advanced degree, isn’t it? Therefore, the solution still remains 2/5?
Dear Mihaela,
The question asks: “What is the probability this employee has an advanced degree OR is female?” If the question simply asked “What is the probability this employee has an advanced degree?”, then that would be the 100 folks with advanced degrees over 400 total, p = 0.25. BUT, we have an OR question. The number of people with advanced degrees, 100, is tricky because it includes both men & women, and if we also include all 100 women, then those women with advanced degrees are in the overlap region and get counted twice. When we count them only once, we get men with advanced degrees (60) + ALL women (100) = 160, and 160/400 = 16/40 = 2/5.
Does all this make sense?
Mike
Thank you for the reply!! Yes, this makes sense! I was not paying enough attention to your explanations!
M.
Mihaela,
You are more than welcome. Best of luck to you!
Mike
#5…I was confused with the solution as well….but my mistake was adding the overlap of women with adv degrees into my solution.
I must tell myself…DON’T DOUBLE DIP…as in….the solution for #5 is correct because we want = ALL WOMEN + ALL ADV DEGREES – DOUBLE DIPPERS…..equivalent to ALL WOMEN + (ALL ADV DEGREE – WOMEN w/ ADV DEGREE) -> ALL WOMEN + MEN w/ ADV DEGREE.
Thanks for the challenging questions. This has been helpful!
Dear LC,
That’s a good way to say it: double-dipping is problematic in this context, as it is in many others!! I’m glad you found this helpful. Best of luck to you!
Mike
In question number 5, “Advanced degree or Female”: are these two events mutually exclusive? After all, one can be an advanced degree holder as well as a female.
Karan,
That question explicitly says: “40% of the females have advanced degrees”. If there’s even one person who’s in both categories, female and advanced degree, that means they are NOT mutually exclusive. As a general rule, “mutually exclusive” applies to things like dice or playing cards, not to humans. Humans & categories of any sort are always messy and complicated, not pure & simple like dice.
Mike
Hi,
I am also confused with the problem 5. Why are we not subtracting the case of both “advanced degree” and “female”? It seems like the probability of 160/400 includes females who have advanced degree, which we do not want. I thought we only want females who do not have advanced degree or males who have advanced degree.
Can you explain it to me what are some differences of problem 2 and 5?
Thank you!
KC
KC,
In logic, there’s a distinction between “inclusive OR”, which means that “A or B” includes A by itself, B by itself, and A & B together, and “exclusive OR”, which includes A by itself and B by itself but not A & B together. In GMAT math problems, whenever the word “or” appears, you must assume that it is an “inclusive OR”. If what they mean is an “exclusive or”, they would have to specify that explicitly.
There are a number of important differences between #2 & #5, the most significant of which is — in #2, there are two separate choices, a choice from Set #1 and a choice from Set #2. In #5, we are just choosing one person with one of two qualities. Very different.
Mike