# Understanding Variance Analysis

In statistics, you often want to compare several groups to see whether they differ. These comparisons frequently rely on variance analysis. Statistical comparisons can involve numerous groups and variables, but luckily there are several tests designed specifically for analyzing the variance between groups.

## Variance Analysis Basics

In its most basic form, variance analysis involves comparing what is expected to what is actually observed. Recall that variance is the difference between the observed values and the model values. In the case of univariate data, the model is a measure of central tendency, such as the mean. In the case of simple correlations, or bivariate data, this would mean the difference between the line of best fit and the actual values measured.
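The univariate case can be sketched in a few lines of Python. The numbers below are hypothetical; the point is that the "model" is simply the mean, and variance summarizes how far the observations fall from it:

```python
import numpy as np

# Hypothetical univariate data: the "model" here is just the mean
scores = np.array([4.0, 7.0, 6.0, 5.0, 8.0])
mean = scores.mean()                 # model value for every observation
deviations = scores - mean           # observed minus model
variance = np.mean(deviations ** 2)  # average squared deviation

print(mean, variance)  # 6.0 2.0
```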

However, in the case of multivariate data, which involves three or more variables, the challenges increase. This is because you are comparing an empirical model to a theoretical model. The easiest way to do this is to compare groups to an outcome. But the number of groups and variables determines the best method to use. Let’s explore some.

## Comparing Two Groups

The simplest setup is to compare two groups on a single outcome. To do this, the best method is a t-test. A t-test compares the same outcome for two groups. The benefit of the test is that it can be set up in two different ways.

A dependent sample t-test compares the same outcome for two groups with the same participants. It is most useful in a before-and-after scenario. For example, say you want to compare student ACT scores before and after a special study session. You would compare the group’s scores before the session to the same group’s scores after the session. The t-test analyzes the variance before and after the study session to see if they are significantly different.
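A minimal sketch of the before-and-after setup, using SciPy's paired t-test and made-up ACT scores for ten students:

```python
import numpy as np
from scipy import stats

# Hypothetical ACT scores for the same ten students, before and after
before = np.array([21, 24, 19, 26, 23, 20, 22, 25, 18, 24])
after  = np.array([23, 25, 21, 27, 23, 22, 24, 26, 20, 25])

# Paired (dependent-sample) t-test: tests whether the mean difference is zero
t_stat, p_value = stats.ttest_rel(after, before)
print(t_stat, p_value)
```

Because the same students appear in both columns, `ttest_rel` works on the per-student differences rather than treating the two columns as unrelated samples.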

An independent sample t-test compares two groups based on the same outcome. This type of setup is most common in experimental studies. Like a dependent sample t-test, the test compares the variance in one group to the variance in the other. For example, let’s say that you want to determine if a new drug helps headaches. One group (the control) would receive a placebo while the other group (the treatment) would receive the actual medication. The t-test compares the outcome of pain relief of both groups to determine if there is a significant difference in pain relief. The benefit of an independent sample t-test is that you don’t need the same number of participants in both groups, or the same participants in both groups.
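Here is a sketch of the placebo-versus-treatment comparison with hypothetical pain-relief scores. Note that the groups don't need to be the same size:

```python
import numpy as np
from scipy import stats

# Hypothetical pain-relief scores (0-10); group sizes need not match
placebo   = np.array([2, 3, 1, 4, 2, 3, 2, 1])
treatment = np.array([5, 6, 4, 7, 5, 6, 5, 4, 6])

# Independent-sample t-test: different participants in each group
t_stat, p_value = stats.ttest_ind(treatment, placebo)
print(t_stat, p_value)
```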

## Comparing Multiple Groups

It is also possible to compare three or more groups. This requires a little more thought in setting up your research, but is incredibly useful. For example, let’s say that you want to compare which dosage of pain reliever helps headaches the most. Now, you would want to compare three groups: the control and two treatment groups. For the purpose of this example, let’s say that you want to determine whether 400 mg of ibuprofen is better than 200 mg, and whether either dose is better than none.

It is tempting to conduct variance analysis using a series of t-tests, but this is not the best approach. Each t-test has error associated with it, so doing multiple t-tests compounds the error. Instead, we use a special technique: analysis of variance, or ANOVA.

### Analysis of Variance

ANOVA is a special method for comparing three or more groups. Instead of analyzing the variance of two groups, it analyzes the variance and the error that occurs in a group and between groups. It compares the groups in the context of a single outcome. In our example, the outcome is pain relief.

For an effective experimental setup, the three groups would be the control (which receives a placebo), the 200 mg group, and the 400 mg group. We would measure the pain relief for each group and then consider not only the variance within each group, but also whether the variance between the groups is large enough to matter.

You analyze the variance using the F-ratio. The F-ratio compares the variance between the groups to the error variance within the groups. Like the t-test, the F-ratio is compared against a reference distribution to determine significance. In this way, one can see if there is a significant difference between the groups.
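The three-group dosage example can be sketched with SciPy's one-way ANOVA. The scores below are invented for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical pain-relief scores for the three groups
placebo = np.array([2, 3, 1, 4, 2, 3])
mg_200  = np.array([4, 5, 3, 6, 4, 5])
mg_400  = np.array([6, 7, 5, 8, 6, 7])

# One-way ANOVA: F compares between-group variance to within-group variance
f_stat, p_value = stats.f_oneway(placebo, mg_200, mg_400)
print(f_stat, p_value)
```

A large F (and small p-value) says the group means differ more than the within-group noise would explain, but not which groups differ.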

A word of caution: the F-test only reveals if there is a significant difference. It does not determine which group is different nor how it is different. For that, you need post hoc tests.
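One simple post hoc approach, shown here as a sketch, is to run the pairwise t-tests with a Bonferroni correction, dividing the significance threshold by the number of comparisons to keep the overall error rate in check (dedicated tests like Tukey's HSD are common alternatives):

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Same hypothetical pain-relief data as the ANOVA example
groups = {
    "placebo": np.array([2, 3, 1, 4, 2, 3]),
    "200 mg":  np.array([4, 5, 3, 6, 4, 5]),
    "400 mg":  np.array([6, 7, 5, 8, 6, 7]),
}

# Bonferroni correction: divide alpha by the number of pairwise comparisons
pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)
for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    print(f"{a} vs {b}: p = {p:.4f}, significant = {p < alpha}")
```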

### Analysis of Covariance

Naturally, there are some things that may affect how effective ibuprofen is from person to person. For example, weight influences dosage. So, someone who weighs 300 lbs may not get the same amount of pain relief as someone who weighs 100 lbs. Variables like this, which can affect the outcome but are not the treatment, are called covariates, and the method that accounts for them is analysis of covariance, or ANCOVA.

ANCOVA accounts for a treatment effect and other mitigating variables, like weight or age, in an experimental trial. In this case, the error variance of the covariates is included in the analysis. ANCOVA is generally used when looking for an effect rather than looking for a model to predict outcomes. The benefit of ANCOVA is that you get a much clearer picture of the effect that predictors have when used with covariates.
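One way to sketch ANCOVA is as a comparison of two regression models: a full model with the treatment groups plus the covariate, and a reduced model with the covariate alone. The F-test on the difference asks whether the treatment explains variance beyond what weight already explains. The data below are simulated, and the effect sizes are made up for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated trial: pain relief depends on dose group and on body weight
weight = rng.normal(160, 30, 30)
group = np.repeat([0, 1, 2], 10)  # placebo, 200 mg, 400 mg
relief = 2.0 + 0.5 * group - 0.01 * weight + rng.normal(0, 1, 30)

def rss(X, y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

n = len(relief)
ones = np.ones(n)
dummies = np.column_stack([(group == g).astype(float) for g in (1, 2)])
X_full = np.column_stack([ones, dummies, weight])  # treatment + covariate
X_red  = np.column_stack([ones, weight])           # covariate only

rss_full, rss_red = rss(X_full, relief), rss(X_red, relief)
df_num, df_den = 2, n - X_full.shape[1]
F = ((rss_red - rss_full) / df_num) / (rss_full / df_den)
p = stats.f.sf(F, df_num, df_den)
print(F, p)
```

Because weight is held in both models, any drop in residual variance from the reduced to the full model is attributable to the treatment groups, which is what gives ANCOVA its clearer picture of the effect.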

## The Takeaways

Variance analysis comes in all shapes and sizes, but it has one underlying goal: examine the difference between what is observed and what is expected. For univariate data, it is an examination of observed deviation from a measure of central tendency. For bivariate data, it is the difference between the model value (or the line of best fit) and the observed value. For multivariate data, it involves comparing groups and their effect on one or more outcomes.

I hope that this post helps shed some light on variance analysis. I look forward to any questions that you may have below. Happy statistics!