What do you do to relax? Personally, I enjoy writing about statistics, among other things. But surely, there are other things, like reading or even playing video games. But which is the most relaxing? Well, that calls for an experiment! I know, you love experiments, especially when you get to use fancy new methods like analysis of variance, a.k.a. ANOVA. Let’s explore ANOVA and how it helps us figure out experimental results.
Analysis of Variance
Analysis of variance, more commonly called ANOVA, is a statistical method that is designed to compare means of different samples. Essentially, it is a way to compare how different samples in an experiment differ from one another if they differ at all. It is similar to a t-test except that ANOVA is generally used to compare more than two samples.
But John, why can’t we just do a bunch of t-tests?
I like your thinking! But I want you to recall that every statistical test carries some risk of a false positive. Each extra t-test compounds that risk, so the chance of at least one false positive grows with every test you run. What starts off as a 5% error rate for one test can turn into roughly 14% for three tests! That is well above the acceptable limit for most research.
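If you would like to check that 14% figure yourself, here is a minimal Python sketch. It assumes the tests are independent of each other, which is a simplification (pairwise t-tests on the same groups are not fully independent), and the function name `familywise_error` is just my own label for illustration.

```python
def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests."""
    # Probability that every single test avoids a false positive...
    all_clean = (1 - alpha) ** n_tests
    # ...so the chance of at least one false positive is the complement.
    return 1 - all_clean


# Three groups need three pairwise t-tests (A-B, A-C, B-C).
for n_tests in (1, 2, 3):
    rate = familywise_error(0.05, n_tests)
    print(f"{n_tests} test(s): {rate:.4f}")
```

With a 5% threshold per test, three tests land at about 0.1426, which is where the 14% comes from.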
ANOVA is a method that takes these little details into account by running a single test built from an overall Grand Mean, Sums of Squares (SS), and Mean Squares (MS, a variance estimate, s²). It compares the variability within the groups to the variability between the groups. ANOVA tests the hypothesis that all the group means are equal against the alternative that at least one of them differs.
This can be a lot to take in, so let’s take a look at some of the little details before we work on an example.
When you use ANOVA, you are testing whether a null hypothesis is true, just like regular hypothesis testing. The difference is that the null hypothesis states that the means of all the groups are equal. You would state it something like μ1 = μ2 = μ3, where each μ is the population mean for one group. A significant ANOVA result tells you that at least one of them is not equal to the others.
ANOVA relies on something called the F-distribution. In short, the F-statistic is the ratio of the variance between the groups to the variance within the groups. If the null hypothesis is true, then the two variances would be about equal and the ratio would be close to 1. We use an F-table of critical values, in a similar way to a t-test, to decide whether the ratio is large enough to reject the null hypothesis.
Analysis of variance compares means, but to compare them all to each other we need to calculate a Grand Mean. The Grand Mean, GM, is the mean of all the scores. It doesn’t matter what group they belong to; we need a total mean for comparison.
The Sum of Squares, SS, is what you get when you add up squared deviations from a mean. For the treatments, SStreat adds up the squared deviation of each group mean from the Grand Mean, weighted by the size of the group. We use this value to calculate something called the Mean Square of Treatment, MStreat, which is SStreat divided by the treatment degrees of freedom (number of groups – 1). It tells you the amount of variability between the groups.
The final detail that we are going to talk about is the Error Sum of Squares, SSerror, which adds up the squared deviation of each score from its own group mean. Remember that variance tells you how spread out your data is. SSerror is used to calculate the Mean Square Error, MSerror, by dividing it by the error degrees of freedom (N – number of groups). This basically tells us the variability of the data within the groups.
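Here is a rough Python sketch of those definitions in action. The scores are made up purely for illustration (they are NOT the data from the experiment later in this post), and the variable names are my own. It assumes every score belongs to exactly one group.

```python
from statistics import mean

# Hypothetical stress scores (1-10); invented for illustration only.
groups = {
    "after class": [8, 9, 7, 8],
    "reading": [4, 5, 3, 4],
    "video games": [5, 6, 6, 7],
}

# Grand Mean: the mean of every score, ignoring group membership.
all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# SStreat: squared distance of each group mean from the Grand Mean,
# weighted by the number of scores in that group (between-group spread).
ss_treat = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# SSerror: squared distance of each score from its own group mean
# (within-group spread).
ss_error = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

# Total spread around the Grand Mean, used here only as a sanity check.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

print("Grand Mean:", grand_mean)
print("SStreat:", ss_treat, "SSerror:", ss_error, "SStotal:", ss_total)
```

A handy check: SStreat and SSerror should add up to the total sum of squares around the Grand Mean.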
Now, that is a lot of terms, so let’s see them in action.
A Relaxing Experiment
Suppose that you are interested in the best way to relax after stats class. After some cursory research, you settle on either reading a book or playing video games as possible choices for relaxation. But which, if either, is best? This is a case for an experiment. You would measure stress for a few students right after stats class, right after relaxing by reading a book, and right after playing video games.
Let’s do that right now! BAM! I have your data ready for you in the table below. We measured stress on a scale from 1 (low) to 10 (high) for a few students under each condition. By the way, these conditions are typically called treatments. I have also squared and summed some values for future use. You’re welcome.
ANOVA determines the differences in the means by comparing the mean square of the treatments to the mean square of the error. It uses this equation:
F = MStreat ÷ MSerror
But these values have to be calculated from still other things! Gosh darn, there’s a lot of calculating. But that is part of the fun of statistics. Let’s calculate the MStreat first.
MStreat refers to the variation that occurs between the groups. So it is calculated using the mean of each group and the Grand Mean. It also utilizes the degrees of freedom based on the number of groups in the study. In your case, there are three groups, so there are two degrees of freedom. Let’s see:
SStreat = Σ Nt × (mean of treatment t – GM)²
In this equation, the t stands for the treatment. For each treatment, take the treatment mean minus the Grand Mean, square it, and multiply by the number of participants in that treatment (Nt). Then add up the results across all the treatments. This is how we know it compares between groups. In your case, the value of SStreat is 60.67.
To get MStreat, we need to take it one step further. We need to divide by the degrees of freedom for the treatment groups. In this case, there are two degrees of freedom since there are three groups. That means MStreat = 60.67 ÷ 2 = 30.33. We will use this value later.
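Now for the Mean Square Error, MSerror. This gives you a sense of the variability within each group. You will be using a form of the variance formula that uses the squared values of your measurements (hence, their inclusion in the table). The formula looks a little like this:
SSerror = Σ [ ΣX² – (ΣX)² ÷ Nt ]
Here the inner sums run over the measurements in treatment t, and the outer sum adds up the results across all the treatments.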
Once again, the t here refers to the specific treatment. Nt refers to the number of measurements in each treatment. Your data gives SSerror = 24.25 if you would like to work it out. But you’re really after MSerror, the amount of variation within the groups. To get this, divide SSerror by the error degrees of freedom, which is N – (the number of groups), or 12 – 3 = 9. Your value for MSerror should be 24.25 ÷ 9 = 2.69.
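The shortcut calculation can be sketched in Python like this. The scores are hypothetical stand-ins, not the values from the table, and the function names are mine. The sketch assumes the shortcut is the usual "sum of squared scores minus the squared sum divided by Nt" form, and checks it against the long way of subtracting each group's mean.

```python
from statistics import mean


def ss_error_shortcut(groups):
    """SSerror from raw sums: for each treatment, sum(X^2) - (sum(X))^2 / Nt."""
    total = 0.0
    for scores in groups:
        n_t = len(scores)                    # measurements in this treatment
        sum_x = sum(scores)                  # sum of the scores
        sum_x2 = sum(x * x for x in scores)  # sum of the squared scores
        total += sum_x2 - sum_x ** 2 / n_t
    return total


def ss_error_long_way(groups):
    """SSerror by subtracting each group's own mean from every score."""
    return sum((x - mean(s)) ** 2 for s in groups for x in s)


# Hypothetical scores for three treatments; invented for illustration only.
groups = [[8, 9, 7, 8], [4, 5, 3, 4], [5, 6, 6, 7]]

n_total = sum(len(s) for s in groups)
df_error = n_total - len(groups)  # N - number of groups
ss_err = ss_error_shortcut(groups)
ms_error = ss_err / df_error

print("SSerror (shortcut):", ss_err)
print("SSerror (long way):", ss_error_long_way(groups))
print("df error:", df_error, "MSerror:", round(ms_error, 3))
```

Both routes give the same SSerror; the shortcut just saves you from computing each deviation by hand.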
The Interesting Part
You’ve made it through the theory, the data, and the math! Because you’re awesome! Now comes the time to run the final check to see if the means are significantly different. Now we compare the MStreat to MSerror using the F formula from earlier:
F = MStreat ÷ MSerror = 30.33 ÷ 2.69 ≈ 11.3, with df = 2, 9
Now you have an F-statistic! The df refers to the degrees of freedom in the numerator and the denominator, respectively. You will need all three of these values when you look up the critical value in the F-Table.
The F-table shows a distribution of critical values based on various degrees of freedom for both MStreat and MSerror. These critical values represent the highest ratio for which the null hypothesis should still be retained, with the usual 5% error threshold.
In the case of 2 degrees of freedom for treatment and 9 degrees of freedom for error, the critical value is 4.26. Since your F-statistic of about 11.3 is higher than that, you can reject the null hypothesis and conclude that the means are significantly different. In your experiment, the conditions DO lead to different amounts of stress!
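If you don't have an F-table handy, here is a hedged Python sketch of the final decision using the sums of squares from this example. It leans on a closed-form shortcut that only works when the numerator has exactly 2 degrees of freedom, as it does here; for anything else you would want a table or a statistics library. The helper names are my own.

```python
# Sums of squares and sample sizes reported in this example.
ss_treat, ss_error = 60.67, 24.25
n_total, n_groups = 12, 3
alpha = 0.05

df_treat = n_groups - 1        # 2
df_error = n_total - n_groups  # 9
ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f_stat = ms_treat / ms_error


def f_tail_prob_2df(x, df_err):
    """P(F > x) when the numerator has exactly 2 degrees of freedom."""
    return (1 + 2 * x / df_err) ** (-df_err / 2)


def f_critical_2df(alpha, df_err):
    """Critical value for the same special case (2 numerator df)."""
    return (df_err / 2) * (alpha ** (-2 / df_err) - 1)


f_crit = f_critical_2df(alpha, df_error)
p_value = f_tail_prob_2df(f_stat, df_error)
reject_null = f_stat > f_crit

print(f"F({df_treat}, {df_error}) = {f_stat:.2f}")
print(f"critical value at {alpha:.0%}: {f_crit:.2f}")
print(f"p-value: {p_value:.4f}")
print("reject the null hypothesis:", reject_null)
```

With these inputs the F-statistic comes out around 11.26 and the 5% critical value around 4.26, so the null hypothesis is rejected.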
ANOVA Heads Up
Although analysis of variance is a wicked awesome statistical method, there are a few things that you absolutely must keep in mind when running these analyses.
First, ANOVA results alone do not tell you which means differ. For example, your results show that at least one condition differs from the others, but is reading better than playing video games? To answer that, you need to conduct post hoc tests for significant differences. There are a few, like Tukey’s Honestly Significant Difference (HSD) test, but those are for another post.
Second, ANOVA can be a great tool for coming up with cause-effect results. But the cause-effect conclusion can only be used if the participants were randomly assigned to groups. If you are using pre-selected groups or non-random assignment (like gender), then try to avoid the cause-effect conclusion.
Third, the data should have a reasonably normal distribution. If the data is too skewed, the variance estimates become unreliable and can push the F-statistic too high or too low.
Finally, analysis of variance comes in many forms (like analysis of covariance [ANCOVA] and multivariate analysis of variance [MANOVA]), but they all have one thing in common. Analysis of variance typically works best with a categorical independent variable (the groups) and a continuous dependent variable (the measurement). So consider ANOVA if you are comparing a measurement across categories.
Ultimately, analysis of variance, ANOVA, is a method that allows you to determine whether the means of three or more groups are significantly different from each other. This method was practically made for experimental setups and can yield wonderful results. Paired with post hoc tests, it also gives valuable information about the way that one group differs from another within an experimental or quasi-experimental setup. ANOVA requires careful calculation and interpretation, but it opens up a whole new realm of research possibilities. Happy statistics!