Today’s topic for GED math: data analysis questions. Data analysis questions on the GED Mathematical Reasoning subject test take on three major forms:
- Data representations
- Descriptive statistics
- Probability
Probability is a complex topic that deserves its own post, so we cover it separately. In this post, we’re going to focus on the first two types of data analysis questions: data representations and descriptive statistics.
The first major type of data analysis question you’ll see involves data representations: the visual ways that we present data in graphs, charts, and tables. You’ll need to be able to read and pull information from several different types of visual data representations in order to answer the questions on the GED math test.
The most basic type of data representation is a table, which uses columns and rows to sort data by category.
For example, here’s a table showing high temperatures in Chicago over the course of a given week.
We’ll get LOTS of practice using information from tables in the Descriptive Statistics section below, so for now, we’ll move on to other, more complex representations.
A line graph plots individual data points on x- and y-axes, then connects those points with a line to form what’s called a series.
For example, here’s a line graph showing the same data from above about the temperatures in Chicago:
You might also encounter line graphs that contain more than one series. These are used to compare different types of data. In these cases, each series will use a different color and/or symbol and a key to help you keep track of which data is which.
Here is an example, again using temperatures in Chicago, but adding a little more complexity:
See if you can answer these questions based on the above graph (answers below):
1. Which day had the lowest average temperature?
2. Which day reached the highest temperature?
3. On which two days were the high temperatures the same?
4. On which days did the high temperature sink below Sunday’s low temperature?
5. Which two days had the widest temperature range over the course of the day?
A scatter plot is used to show data in a similar way to a line graph, only there is no line connecting each of the points.
In the example below, hours spent studying are graphed against test scores. Sometimes, a scatter plot includes a line known as a line of best fit. This line is used to show a general trend in the data. The line of best fit in the graph below helps you see the relationship between the variables.
Try the following question:
Based on the graph above, there appears to be what kind of correlation between hours spent studying and test score?
A) Positive correlation
B) Negative correlation
C) No correlation
Bar graphs are used for comparing data. Like line graphs and scatter plots, they use x- and y-axes to plot data. Bar graphs, however, use shaded bars to represent the data instead of individual points.
Here’s an example:
Try to answer this question based on the graph (answer below):
At which educational level is there the greatest income gap between men and women?
The final type of data representation you are likely to see is a pie chart. Pie charts differ the most from the other types of graphs we’ve seen. On a pie chart, the entire circle represents a total or a whole. The circle is divided into segments to show the relative sizes of different portions of the whole.
Use the chart above to answer the following questions (answers below):
1. The greatest number of students use which form of transportation?
2. If there are a total of 500 students, how many students ride their bikes to school?
3. If there are a total of 500 students, how many more students take the bus than walk?
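If you’d like to check the arithmetic behind questions like 2 and 3, here is a rough Python sketch. It converts a slice’s percentage into a head count by multiplying by the total and dividing by 100. The total of 500 comes from the questions above, but the percentages are placeholders we made up for illustration. They are not read from the chart, so don’t treat the results as the answers.

```python
def count_from_share(total, percent):
    """Convert a pie-chart percentage into a count out of the total."""
    return total * percent / 100


total_students = 500  # given in the questions above

# Placeholder percentages for illustration only; NOT the values from the chart.
shares = {"bus": 40, "walk": 25, "bike": 15, "car": 20}

bike_riders = count_from_share(total_students, shares["bike"])
bus_minus_walk = (
    count_from_share(total_students, shares["bus"])
    - count_from_share(total_students, shares["walk"])
)

print(bike_riders)      # students who bike, under the placeholder shares
print(bus_minus_walk)   # how many more take the bus than walk
```

A quick sanity check before you calculate anything: the slices of a pie chart should add up to 100%.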
Descriptive statistics are just that: descriptive. We need various measures to help us describe and make sense of data we gather. The four types of descriptive statistics you will encounter are mean, median, mode, and range.
Mean is the measure people generally have in mind when they say “average.” To calculate the mean, add up all of the values in the data set and divide by the number of values in the set.
Consider the following table:
To find the mean of the GPAs (the average GPA), you would first add up all of the GPAs:
Now, divide that sum by the number of values in the set, 6:
So, the average GPA of this particular chess club team is about 3.76.
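The same two steps (add, then divide) can be sketched in a few lines of Python. This is only an illustration: the function name `mean` is our own, and the sample values are made up. They are not the chess club GPAs from the table.

```python
def mean(values):
    """Add up all of the values, then divide by how many values there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Made-up values for illustration; NOT the GPAs from the table above.
sample_gpas = [3.5, 4.0, 3.0, 3.5, 4.0, 3.0]

average = mean(sample_gpas)
print(round(average, 2))
```

On the test itself you’ll do this by hand or with the calculator, but the steps are the same.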
You can also use the mean to find the missing value in a data set. Say, for example, you had the following information:
Using a little algebra, you can find Robyn’s GPA. Set up an equation for the mean, filling in the known GPAs and the given mean, and let the unknown GPA be x. Since there are 6 GPAs altogether, you divide the sum by 6 to find the mean. So, your equation and solution will look like this:
So, Robyn’s GPA is 3.9.
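If you rearrange the mean equation, the missing value is the mean times the number of values, minus the sum of the values you already know. Here is a small Python sketch of that rearrangement. The helper name `missing_value` and the numbers are our own assumptions for illustration. They are not the values from the table above.

```python
def missing_value(known_values, target_mean):
    """Solve (sum of known values + x) / n = mean for the unknown x."""
    n = len(known_values) + 1          # the known values plus the one missing value
    return target_mean * n - sum(known_values)


# Made-up example: five known GPAs and an overall mean of 3.5 across six students.
known = [3.0, 3.5, 4.0, 3.5, 3.0]
x = missing_value(known, 3.5)
print(x)
```

You can check the result by adding it back into the set and recalculating the mean.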
The median is the data point that falls in the middle of the set of numbers. In other words, it is the data point that has as many data points above it as below it.
Let’s take the example of the Chicago temperature table again.
To find the median, you first need to rearrange the data so that the values are in ascending order:
Now, you need to find the data point that is in the middle of the data set. Since there are 7 data points in this set, you are looking for the value that has 3 values above it, and 3 values below it:
So, 38 is the median value in this set of numbers.
If you have an even number of values in a data set, you need to find the pair of numbers in the middle, and then take the mean of those two values.
For example, let’s take the daily high temperatures in Chicago over a two-week period:
Now let’s arrange the values in ascending order:
The data set now has 14 numbers, so there is no single middle number. Instead, isolate the middle pair, 38 and 39:
This pair has 6 data points below it, and 6 data points above it. To find the median, calculate the mean of this pair: (38 + 39) ÷ 2 = 38.5.
So, the median of the daily high temperatures for these two weeks is 38.5 degrees.
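Both cases (an odd or an even number of values) can be sketched in one short Python function. The data below is the ascending list of two-week highs that appears in the mode section of this post, which appears to be the same data set. The function name `median` is our own choice.

```python
def median(values):
    """Sort the values, then take the middle one (or the mean of the middle pair)."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: one value sits exactly in the middle.
        return ordered[mid]
    # Even count: take the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


two_week_highs = [29, 31, 34, 36, 36, 37, 38, 39, 42, 42, 42, 43, 46, 59]
print(median(two_week_highs))
```

Notice that the function sorts first. Forgetting to put the values in order is the most common way to get a median wrong.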
The mode is the data point that appears most frequently in a set. If no value appears more than once in the set, then the data has no mode. If multiple values appear the same number of times, the data can have more than one mode.
Let’s look again at the daily high temperatures in Chicago, listed in ascending order:
29, 31, 34, 36, 36, 37, 38, 39, 42, 42, 42, 43, 46, 59
To find the mode, look for any value that appears more than once:
Since 42 shows up the most in the data set, 42 is the mode.
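Counting by eye works for short lists, but here is a minimal Python sketch of the same idea using the temperature list above. The helper name `modes` is our own. It returns a list so that it can handle the “no mode” and “more than one mode” cases described earlier.

```python
from collections import Counter


def modes(values):
    """Return every value tied for most frequent, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value appears more than once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


two_week_highs = [29, 31, 34, 36, 36, 37, 38, 39, 42, 42, 42, 43, 46, 59]
print(modes(two_week_highs))
```

In this list, 36 appears twice and 42 appears three times, so only 42 is returned.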
Range is simply the difference between the highest and lowest values. The smaller the range, the closer together the values are. The larger the range, the more the values are spread out.
For example, let’s look at the ages of people in a local birding group:
First, identify the highest value in the group (62), and the lowest value in the group (36). To calculate the range, subtract the lowest value from the highest: 62 − 36 = 26.
So, we would say that the range of ages in the birding group is 26.
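As a quick sketch, the range calculation is a one-liner in Python. We’ve named the helper `value_range` (our own choice) so it doesn’t clash with Python’s built-in `range`. The full list of ages isn’t reproduced here, so the example uses only the highest and lowest ages from above, plus the two-week Chicago highs from the mode section.

```python
def value_range(values):
    """Subtract the lowest value from the highest value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


# Only the extremes matter, so the highest and lowest ages are enough here.
print(value_range([62, 36]))

two_week_highs = [29, 31, 34, 36, 36, 37, 38, 39, 42, 42, 42, 43, 46, 59]
print(value_range(two_week_highs))
```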
Line Graph Answers
3. Monday and Wednesday
4. Thursday, Friday, and Saturday
5. Sunday and Wednesday
Scatter Plot Answer
The line of best fit makes it easier to see the relationship. The line slopes up, so there is a positive correlation. If the line sloped down, there would be a negative correlation. If the line were flat or if the data were too scattered to have a line of best fit, there would be no correlation.
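If you’re curious where a line of best fit comes from, here is a rough Python sketch that computes the slope of a least-squares line and reads the direction of the trend from its sign. The hours and scores are made-up numbers for illustration, not the points from the graph, and the helper names are our own. Keep in mind that the slope only tells you the direction of the trend. It can’t tell you whether the points are too scattered for the line to mean much.

```python
def best_fit_slope(xs, ys):
    """Slope of the least-squares line through the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    if denominator == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    return numerator / denominator


def describe_trend(slope, tolerance=1e-9):
    """Label the direction of the trend from the sign of the slope."""
    if slope > tolerance:
        return "positive"
    if slope < -tolerance:
        return "negative"
    return "none"


# Made-up data for illustration; NOT the points from the scatter plot.
hours = [1, 2, 3, 4, 5]
scores = [60, 65, 75, 80, 90]

slope = best_fit_slope(hours, scores)
print(slope, describe_trend(slope))
```

You won’t need to calculate a line of best fit on the GED. You only need to read its direction from the graph.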
Bar Graph Answer
Pie Chart Answers