You can’t always get what you want, but if you try sometimes, you get what you need. In terms of statistics, we want to know everything that there is to know about a group (or population). But sometimes that is just not feasible. Instead, we work with a smaller group (or sample), compute approximations from it, and hope that the answer we get isn’t too far from the truth. The difference between the truth about the population and what the sample tells us is called sampling variability.
In its most basic definition, sampling variability is the extent to which the measures of a sample differ from the measure of the population. But before we get into the big picture, there are a few details that we need to discuss.
Parameters and Statistics
When it comes to measures involving a population, you can very rarely measure them directly. For example, it is not really possible to measure the height of every American and compute the mean. Instead, you take a random selection of Americans and then measure their mean height.
The mean height of all Americans is an example of a parameter. A parameter is a value that refers to the population (such as a mean, standard deviation, or percentage of sub-groups) that we usually cannot know exactly. It is rarely possible to measure a parameter directly; however, it is possible to estimate one using statistics.
A measure that refers to a sample is called a statistic. An example would be the average height of a random sample of Americans. The parameter of a population never changes because there is only one population, but a statistic changes from sample to sample because there is always unexpected variation between samples. However, with a large enough sample, you generally get close to the parameter.
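To make the parameter-versus-statistic idea concrete, here is a minimal sketch in Python. The population is entirely made up (100,000 simulated heights with a mean near 69 inches and a standard deviation of 3), and the names `population`, `parameter`, and `sample_mean` are my own. The only reason we can compute the parameter here is that we invented the population ourselves.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights in inches (not real data)
population = [random.gauss(69, 3) for _ in range(100_000)]

# The parameter: knowable here only because we built the population
parameter = statistics.mean(population)


def sample_mean(n):
    """Draw a simple random sample of size n and return its mean (a statistic)."""
    return statistics.mean(random.sample(population, n))


# The statistic changes from sample to sample; the parameter does not
small_sample_means = [sample_mean(20) for _ in range(5)]
large_sample_means = [sample_mean(2000) for _ in range(5)]

print(f"parameter (population mean): {parameter:.2f}")
print("means of five samples of 20:  ", [round(m, 2) for m in small_sample_means])
print("means of five samples of 2000:", [round(m, 2) for m in large_sample_means])
```

If you run it, the five small-sample means should bounce around noticeably, while the large-sample means should sit much closer to the parameter.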
What is Variability
With all this talk of statistics and parameters, one big question comes up: what if the sample statistic is different from the parameter of the population? Well, it may be. That difference between the sample statistic and the parameter is called sampling variability.
There is always variability in a measure. Variability comes from the fact that not every participant in the sample is the same. For example, the average height of American males is 5’10″ but I am 6’3″. I vary from the mean, so this introduces some variability.
Generally, we refer to the variability as standard deviation or variance.
Measures of Sampling Variability
When it comes to the population, the amount that the individual measures deviate from the parameter is called the standard deviation. It is represented by σ. Remember that we can rarely measure or know the true standard deviation of a population. Instead, we estimate it from the standard deviation of the sample.
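For reference, the usual textbook formulas for the two standard deviations are written out below. The symbols other than σ and s are notation I am adding for illustration: N is the population size, μ the population mean, n the sample size, and x̄ the sample mean.

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}
```

The sample version divides by n - 1 rather than n, which is the common convention when s is used to estimate σ.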
The variability within the sample is still referred to as the standard deviation, but it is represented by s. How well s estimates σ depends on the size of the sample. A sample of 20 may give a very different standard deviation than a sample of 200, even if they are measuring the same thing.
There is no single ideal sample size, although each statistical method has a preferred sample size. For example, a commonly cited rule of thumb for t-tests is a minimum sample size of 30, while principal component analysis is usually said to need samples in the hundreds.
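Here is a rough simulation of the 20-versus-200 comparison. It assumes made-up heights drawn from a normal distribution with a true standard deviation of 3 inches; `sample_sd`, `sds_n20`, and `sds_n200` are names I chose for this sketch. It draws 1,000 samples of each size and looks at how much the estimate s wobbles from sample to sample.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_SD = 3  # assumed population standard deviation (illustrative only)


def sample_sd(n):
    """Return the sample standard deviation s of n simulated heights."""
    sample = [random.gauss(69, TRUE_SD) for _ in range(n)]
    return statistics.stdev(sample)  # stdev divides by n - 1


# Repeat the experiment many times for each sample size
sds_n20 = [sample_sd(20) for _ in range(1000)]
sds_n200 = [sample_sd(200) for _ in range(1000)]

# How much does s itself vary from sample to sample?
spread_n20 = statistics.stdev(sds_n20)
spread_n200 = statistics.stdev(sds_n200)

print(f"n = 20:  s ranges from {min(sds_n20):.2f} to {max(sds_n20):.2f}")
print(f"n = 200: s ranges from {min(sds_n200):.2f} to {max(sds_n200):.2f}")
print(f"spread of s: {spread_n20:.2f} (n = 20) vs {spread_n200:.2f} (n = 200)")
```

Both sample sizes aim at the same σ, but the estimates from samples of 200 should cluster much more tightly around it.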
Uses of Sampling Variability
Sampling variability is useful in most statistical tests because it gives us a sense of how different the data are. As I said earlier, I am taller than the average height, but there are also some people who are shorter than the average height. The sampling variability is the amount of difference between the measured values and the statistic.
If the variability is low, then there are small differences between the measured values and the statistic, such as the mean. If the variability is high, then there are large differences between the measured values and the statistic. You generally want data that have low variability.
Sampling variability is often used to determine the structure of data for analysis. For example, principal component analysis analyzes the differences between the sampling variability of specific measures to determine if there is a connection between variables.
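A small sketch of low versus high variability, using two made-up samples of heights (in inches) that share the same mean. The numbers and variable names are invented for illustration.

```python
import statistics

# Two hypothetical samples with the same mean but different variability
low_variability = [69, 70, 70, 71, 70, 69, 71, 70]
high_variability = [60, 80, 65, 75, 70, 62, 78, 70]

for name, sample in [("low", low_variability), ("high", high_variability)]:
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    print(f"{name:>4} variability: mean = {mean:.1f}, s = {s:.2f}")

low_sd = statistics.stdev(low_variability)
high_sd = statistics.stdev(high_variability)
```

The means match, so the mean alone cannot tell the two samples apart. The standard deviation is what shows that the values in the second sample sit much farther from their mean.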
Sampling variability is the difference between the measured value and the true statistic or parameter. The sampling variability is also referred to as standard deviation or variance of the data. It is used in several types of statistical tests to analyze the data for an underlying structure.
I hope that this post has helped you out; I look forward to seeing questions below. Happy statistics!