When I was in college, I needed a 3.80 grade point average (GPA) to earn the highest honor, Summa Cum Laude. During my last semester, I averaged all my previous semesters’ GPAs and got 3.797, which my college would round to 3.80. Victory!
But, it didn’t work that way. Turns out that some semesters counted more than others because the college used a weighted average, which yielded a 3.78.
In statistics, weighted averages account for the fact that not all samples, or parts of the population, are created equal. Let’s take a closer look at my grades and see.
Throwing Weight Around
The general idea of an average is that it represents measurements from a sample, and each measurement had an equal chance of being chosen from the population. In the case of my crazy college experience, it would mean that I took the same number of credit hours each semester. Of course, this was not the case. I have included a list of my credit hours each semester and the respective GPA in the table below.
Now, a typical average involves the sum of all the numbers in the dataset divided by the total number of numbers in the set. This gives an overall picture of the sample. So, for my data, the picture would look like
The average is a great way to get an overall picture of the data. However, we are naturally assuming that the sample is an accurate and evenly weighted representation of the population. For example, if I had taken 15 hours’ worth of classes every semester, then each semester would have been weighted equally across my entire degree. In that case, the simple mean of my semester GPAs would have matched my overall GPA.
But it didn’t.
I got a lot of 4.00s but still graduated with a 3.78 instead of the predicted 3.80. This is because not all of my semesters were equal; some had more class hours than others.
For example, one semester I took 19 hours and in another, I took 6. This means that the GPA from the 19-hour semester has more weight than the GPA from the 6-hour semester because there is more of it. We have to account for this difference in our calculations by treating the weight of the semester as a coefficient on the value (the GPA, in this case). Here, let’s see what the equation would look like.
In the numerator, notice the number of hours for each semester is multiplied by the respective GPA. This adjusts the GPA for the weight of the semester. In the denominator, the hours for all semesters are summed. Dividing by total hours, rather than by the number of semesters, brings the hour-weighted sum back to the usual GPA scale.
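To make the numerator and denominator concrete, here is a minimal Python sketch. The three semesters in it are hypothetical numbers picked for illustration, not my actual transcript, and the function name `weighted_mean` is just a label I chose.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical semesters (NOT the author's real grades).
gpas = [4.0, 3.5, 4.0]   # GPA earned each semester
hours = [6, 19, 15]      # credit hours taken each semester

simple = sum(gpas) / len(gpas)         # every semester counts the same
weighted = weighted_mean(gpas, hours)  # semesters count by credit hours

print(f"simple average:   {simple:.4f}")
print(f"weighted average: {weighted:.4f}")
```

Because the 3.5 came in the heaviest (19-hour) semester, the weighted figure lands below the simple one, which is the same effect that pulled my GPA under the cutoff.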
The Nitty Gritty Details
Now that you have seen an example, let’s take a look at the statistical details of the weighted average. The general formula for the weighted average is
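In the usual textbook notation it can be written as follows. The symbol names are my assumption here: x_i is the i-th value, w_i is its weight, and n is the number of terms.

```latex
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i \, x_i}{\sum_{i=1}^{n} w_i}
```

When every w_i is the same, the weights cancel and this reduces to the ordinary average.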
Remember that the i indexes each term in the data. Each term has a weight associated with it, and that weight shows up twice: multiplied by the value in the numerator and added into the sum of weights in the denominator. Don’t forget to account for it in both places.
Things to Look Out For
There are a couple of things that you need to keep in mind when using the weighted average. One is outliers and the other is variance.
Recall that outliers are those data points that seem a little extreme. What I mean is that they sit a little (or a lot) higher or lower than the majority of the data. If an outlier carries a large weight, it pulls the weighted average further than it would pull the regular average; if its weight is small, it matters less. Just some food for thought.
Variance measures how spread out your data is around the mean. Because it is computed from the mean, if you use a weighted mean, your variance will also need to be weighted. So, any statistical method that builds on your mean, such as the t-statistic, will also need to use a weighted variance or standard deviation.
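As a rough sketch of what "weighting the variance" can mean, the Python below uses one common definition: the weighted average of squared deviations from the weighted mean. This is an assumption on my part. It is the population-style version, and other definitions apply correction factors depending on what the weights represent. The data is hypothetical and the function names are my own.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


def weighted_variance(values, weights):
    """Population-style weighted variance (no small-sample correction).

    Computes sum(w_i * (x_i - mean_w)**2) / sum(w_i), where mean_w is
    the weighted mean. This is one common definition, not the only one.
    """
    mean_w = weighted_mean(values, weights)
    squared_devs = (w * (x - mean_w) ** 2 for x, w in zip(values, weights))
    return sum(squared_devs) / sum(weights)


# Hypothetical semesters (not real grades).
gpas = [4.0, 3.5, 4.0]
hours = [6, 19, 15]

var_w = weighted_variance(gpas, hours)
std_w = var_w ** 0.5  # weighted standard deviation

print(f"weighted variance: {var_w:.6f}")
print(f"weighted std dev:  {std_w:.6f}")
```

With equal weights this collapses to the ordinary population variance, which is a handy sanity check.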
Ok, But When Do I Need a Weighted Average?
Let’s try out weighted averages in a couple of examples:
Do you know how tall you are? I mean, most people do. I’m a whopping 6’3″ or 190.5 cm. But is that higher or lower than average for a man, woman, or human in general? Let’s take a look.
The average height of males in the United States is 177 cm. I’m going to use centimeters in this example because a single decimal unit is easier to average than a mix of feet and inches. So, I’m a little over average there. The average height of females in the United States is 162 cm. I’m well over that average, too.
Both of these are averages, but neither tells us the average height of the average person. What I mean is, what is the average height of a person selected at random from the population of the United States? NOW we need a weighted average to account for the fact that the population of the US is not equally distributed between females and males.
About 51.5% of the US identifies as female, while the remaining 48.5% identifies as male. So our average should take this unequal weighting into account. Using our setup from the previous example, with the population shares as the weights, we get a formula that looks like this
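Here is that calculation as a short Python sketch, using the shares and heights quoted above. The dictionary layout and variable names are just my choice for illustration.

```python
# Population share (weight) and average height in cm for each group,
# taken from the figures quoted in the text.
groups = {
    "female": (0.515, 162.0),
    "male": (0.485, 177.0),
}

total_share = sum(share for share, _ in groups.values())
overall_height = (
    sum(share * height for share, height in groups.values()) / total_share
)

print(f"weighted average height: {overall_height:.1f} cm")
```

The result is about 169.3 cm, slightly below the 169.5 cm you would get by simply splitting the difference, because the shorter group makes up a bit more of the population.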
Now our average accounts for the fact that the population is not evenly split. This is the real purpose of weighted averages. They account for unequal weights in the sample or population. In statistics, you will most often see them used to account for gender, race, ethnicity, and other nominal (categorical) variables.
The weighted average is also used in business to account for prices in goods and services purchased from different vendors. This one is interesting.
Let’s say you have started a clothing line specifically for Doberman pinschers! Congratulations! My wife will be shopping from you very soon since she just got one. But you are going to need some fabric.
So you buy 1,000 yards of beautiful white fabric from one vendor at $1.50 a yard. Since you bought all they had, you have to get more from a second vendor later in your production process. In fact, you buy 1,700 yards at $1.75 a yard. Now, you could work from the simple average of the two prices, about $1.63 per yard, but you would be underestimating your costs because you bought more at the higher price! So, calculate the weighted average using the yardage as the weight.
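A minimal Python sketch of that calculation, using the yardage and prices from the example. The variable names are made up for illustration.

```python
# (yards bought, price per yard in dollars) for each vendor
purchases = [
    (1000, 1.50),
    (1700, 1.75),
]

total_yards = sum(yards for yards, _ in purchases)
total_cost = sum(yards * price for yards, price in purchases)

weighted_cost = total_cost / total_yards  # weight each price by yards
simple_cost = sum(price for _, price in purchases) / len(purchases)

print(f"simple average:   ${simple_cost:.3f} per yard")
print(f"weighted average: ${weighted_cost:.3f} per yard")
```

Notice that the weighted average is nothing more than total dollars spent divided by total yards bought, which is why it is the right number to carry into your costing.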
This means that your inventory of fabric averages a cost of $1.66 per yard, which gives you a more accurate number to build your product costs on.
The weighted average is one of those things that is used to more accurately portray a sample in relation to a population. There are some particulars to watch when you use it, like outliers and variance, but overall it is a pretty well-rounded way to account for differences in the data. Not to mention all the places that it crops up in school, social science research, and even business. Happy statistics!