2.1. Frequency and Frequency Table
2.1.1. Frequency Distribution
Definition - Frequency Distribution
A frequency distribution, also known as a frequency table, summarizes data by grouping it into distinct classes and counting the number of observations within each class. These classes are usually defined by intervals of equal width. The frequency of each class represents the number of data values falling within that interval. Frequency distributions can be presented in either tabular or graphical formats, aiding in the identification of patterns, trends, and outliers within a given dataset [Heumann and Schomaker, 2023, Illowsky and Dean, 2023].
Frequency distribution of qualitative data is a way of summarizing the different categories and how often they occur in a data set. For example, if we have a data set of people’s favorite colors, we can create a frequency distribution of qualitative data by listing each color and the number of times it appears in the data set.
Constructing Frequency Distributions
To create a frequency distribution of qualitative data, follow these steps:
Identify the different categories or values in the data set and write them in the first column of a table.
For each data point, put a tally mark in the second column of the table next to the corresponding category or value.
Count the number of tally marks for each category or value and write the totals in the third column of the table.
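The tallying steps above can be sketched in Python. This is a minimal illustration, not the text's own method: the `responses` list is hypothetical data invented for the sketch, and `collections.Counter` stands in for the manual tally marks.

```python
from collections import Counter

# Hypothetical responses, invented for illustration only.
responses = ["Blue", "Red", "Blue", "Green", "Red", "Blue", "Yellow"]

# Counter does the tallying: one count per distinct category.
frequency = Counter(responses)

# Print the frequency table, most frequent category first.
print("Color | Frequency |")
print("---|---|")
for color, count in frequency.most_common():
    print(f"{color} | {count} |")
```

The counts always sum to the number of observations, which is a quick check that no data point was missed.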
Given the following list of color preferences collected from a group of students, perform a frequency analysis to determine the popularity of each color:
Create a frequency distribution for these colors.
Solution: To create a frequency distribution for these colors, we count how many times each color appears in the list. The resulting frequency distribution table would be:
Frequency distribution:
Color | Frequency |
---|---|
Blue | 7 |
Red | 5 |
Green | 4 |
Pink | 2 |
Yellow | 2 |
The table summarizes the data, showing the number of times each color appeared in the list. For example, there were 7 occurrences of the color “Blue” in the dataset, making it the most frequent color preference among the students. On the other hand, “Yellow” and “Pink” appeared only 2 times each, making them the least frequent color preferences.
How to Create a Frequency Table Using Intervals?
Creating a frequency table with intervals involves organizing data into classes or groups and then counting the number of data points that fall into each class. Here’s how you can do it:
Determine the Range of the Data:
Find the minimum and maximum values in your dataset.
Example: If your dataset is 45, 52, 52, 59, 53, 104, 102, 107, 105, 102, the minimum value is 45 and the maximum value is 107.
Decide the Number of Classes:
Choose the number of intervals (classes) you want to divide your data into. A common choice is between 5 and 10 classes.
Example: We choose 7 classes.
Calculate the Class Width:
Use the formula:
\[\begin{equation}\text{Class Width} = \dfrac{\text{Range}}{\text{Number of Classes}}\tag{2.1}\end{equation}\]
Round up to a convenient number if necessary.
Example:
\[\begin{equation*}\text{Class Width} = \dfrac{107 - 45}{7} \approx 8.86\end{equation*}\]
Rounding up to a convenient number, we use a class width of 10.
Determine the Class Limits:
Start with the minimum value (or a convenient number at or just below it) and add the class width to find the upper limit of the first class.
Continue this process to create all classes.
Example: 40–50, 50–60, 60–70, 70–80, 80–90, 90–100, 100–110
Tally the Frequencies:
Count the number of data points that fall into each class.
Example: Count how many data points fall between 40 and 50, between 50 and 60, and so on.
Create the Frequency Distribution Table:
List the classes and their corresponding frequencies.
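As a rough sketch, the procedure above can be run on the ten-value example dataset. The text does not say which class a value on a shared limit (such as 50) belongs to, so the code assumes each class includes its lower limit and excludes its upper limit. The variable names are illustrative.

```python
# Example dataset from the text.
data = [45, 52, 52, 59, 53, 104, 102, 107, 105, 102]
num_classes = 7

# Range divided by the number of classes gives the raw class width.
raw_width = (max(data) - min(data)) / num_classes  # (107 - 45) / 7, about 8.86

# Rounded up to a convenient number, and started at a convenient
# value at or below the minimum, as in the text's example.
class_width = 10
start = 40

# Class limits as (lower, upper) pairs: 40–50, 50–60, ..., 100–110.
classes = [(start + i * class_width, start + (i + 1) * class_width)
           for i in range(num_classes)]

# Tally. ASSUMPTION: lower <= x < upper, so 50 would count in 50–60.
frequencies = [sum(1 for x in data if lower <= x < upper)
               for lower, upper in classes]

print("Class | Frequency |")
print("---|---|")
for (lower, upper), count in zip(classes, frequencies):
    print(f"{lower}–{upper} | {count} |")
```

With this dataset no value sits exactly on a class limit, so the boundary assumption does not change the counts here.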
Our heart rate, also known as our pulse, is the number of times our heart beats per minute. It varies from person to person and can be an important gauge of heart health. When we’re resting and calm, our heart rate is typically between 60 and 100 beats per minute. However, a heart rate lower than 60 doesn’t necessarily signal a medical problem, especially if we’re taking certain medications or are physically active. Conversely, a heart rate above 100 at rest is considered tachycardia, while a rate below 60 is bradycardia. Remember that individual variations exist, and consulting a healthcare professional is essential if we have concerns about our heart rate. Consider the following dataset representing measured pulse rates from a fictional sample:
Create a frequency distribution for this dataset. Use class-width of 10 starting from 40.
Solution: For this dataset, we create a frequency distribution table:
Lower-class limits:
These represent the minimum values assigned to each class in a frequency distribution.
Example: In the class interval “50–60,” the lower-class limit is 50.
Upper-class limits:
These are the maximum values assigned to each class.
Class midpoints:
Class midpoints are values located in the center of each class.
They are calculated as the average of the lower-class limit and the upper-class limit.
\[\begin{equation}\text{Class midpoint} = \dfrac{\text{Lower-class limit} + \text{Upper-class limit}}{2}\tag{2.2}\end{equation}\]
Class width:
The class width is the difference between two adjacent lower-class limits (or two adjacent lower-class boundaries) in a frequency distribution.
In the given example, the class width is 10.
Pulse Rates | Frequency |
---|---|
40–50 | 1 |
50–60 | 4 |
60–70 | 28 |
70–80 | 22 |
80–90 | 29 |
90–100 | 21 |
100–110 | 5 |
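The definitions of class width and class midpoint, Equation (2.2), can be checked against the pulse-rate classes with a short sketch. Only the class limits from the table are used, and the variable names are illustrative.

```python
# Lower-class limits of the pulse-rate classes in the table.
lower_limits = [40, 50, 60, 70, 80, 90, 100]

# Class width: difference between two adjacent lower-class limits.
class_width = lower_limits[1] - lower_limits[0]

# Upper-class limits follow by adding the class width.
upper_limits = [lower + class_width for lower in lower_limits]

# Class midpoint: average of the lower- and upper-class limits.
midpoints = [(lower + upper) / 2
             for lower, upper in zip(lower_limits, upper_limits)]

for lower, upper, mid in zip(lower_limits, upper_limits, midpoints):
    print(f"{lower}–{upper}: midpoint {mid}")
```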
2.1.2. Relative Frequency Distribution
Definition - Relative Frequency Distribution
A relative frequency distribution, also known as a percentage frequency distribution, is an alternative form of the standard frequency distribution. In this variation, the frequencies in each class are replaced by relative frequencies (or proportions) or percentages. Regardless of whether we use relative frequencies or percentages, we refer to this variation as the “relative frequency distribution” [Heumann and Schomaker, 2023, Illowsky and Dean, 2023].
The calculations for relative frequencies and percentages are as follows:
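Relative Frequency for a Class:
The relative frequency for a class is the ratio of the frequency for that class to the sum of all frequencies.
Mathematically:
\[\begin{equation*}\text{Relative Frequency for a Class} = \dfrac{\text{Frequency for the Class}}{\text{Sum of All Frequencies}}\end{equation*}\]
Percentage for a Class:
The percentage for a class is the same ratio multiplied by 100%.
Mathematically:
\[\begin{equation*}\text{Percentage for a Class} = \dfrac{\text{Frequency for the Class}}{\text{Sum of All Frequencies}} \times 100\%\end{equation*}\]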
Create a relative frequency table for Example 2.1.
Solution: To create relative frequencies, we need to follow these steps:
Calculate the Total Frequency: Add up the frequency of all items to get the total frequency. In your table, the total frequency is 20.
Determine Relative Frequency: For each item, divide its frequency by the total frequency to get its relative frequency. For example, for Blue:
\[\begin{equation*}\text{Relative Frequency of Blue} = \dfrac{7}{20} = 0.350\end{equation*}\]
Convert to Percentage: To express the relative frequency as a percentage, multiply it by 100. Continuing with the example of Blue:
\[\begin{align*} \text{Relative Frequency of Blue (Percentage)} &= \text{Relative Frequency of Blue} \times 100 \\ &= 0.350 \times 100 = 35.0\% \end{align*}\]
In addition,
Red:
Relative Frequency: \(\dfrac{5}{20} = 0.250\)
Relative Frequency (Percentage): \(0.250 \times 100 = 25.0\%\)
Green:
Relative Frequency: \(\dfrac{4}{20} = 0.200\)
Relative Frequency (Percentage): \(0.200 \times 100 = 20.0\%\)
Pink and Yellow:
Relative Frequency: \(\dfrac{2}{20} = 0.100\)
Relative Frequency (Percentage): \(0.100 \times 100 = 10.0\%\)
The sum of all relative frequencies should equal 1, and the sum of all relative frequencies in percentage should equal 100%. This confirms that the relative frequencies are correctly calculated.
Color | Frequency | Relative Frequency | Relative Frequency (Percentage) |
---|---|---|---|
Blue | 7 | 0.35 | 35.00 |
Red | 5 | 0.25 | 25.00 |
Green | 4 | 0.20 | 20.00 |
Pink | 2 | 0.10 | 10.00 |
Yellow | 2 | 0.10 | 10.00 |
Total | 20 | 1.00 | 100.00 |
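The same calculation can be sketched in Python, starting from the color frequencies in the table. The dictionary layout and variable names are choices made for this illustration.

```python
# Frequencies from the color preference table.
frequency = {"Blue": 7, "Red": 5, "Green": 4, "Pink": 2, "Yellow": 2}

# Total frequency: the sum of all class frequencies.
total = sum(frequency.values())

# Relative frequency: class frequency divided by the total.
relative = {color: count / total for color, count in frequency.items()}

# Percentage: the same ratio multiplied by 100.
percentage = {color: 100 * count / total for color, count in frequency.items()}

print("Color | Frequency | Relative Frequency | Relative Frequency (Percentage) |")
print("---|---|---|---|")
for color, count in frequency.items():
    print(f"{color} | {count} | {relative[color]:.2f} | {percentage[color]:.2f} |")
print(f"Total | {total} | {sum(relative.values()):.2f} | {sum(percentage.values()):.2f} |")
```

Formatting the totals to two decimals hides any tiny floating-point error in the sums.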
Create a relative frequency table for Example 2.2.
Solution: We need to follow these steps:
Total Frequency: We first sum up the frequency of all pulse rate ranges to get our total frequency. In our table, the total frequency is 110.
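Relative Frequency: We then calculate the relative frequency for each pulse rate range by dividing its frequency by the total frequency. For instance, for the 60–70 range:
\[\begin{equation*}\text{Relative Frequency of 60–70} = \dfrac{28}{110} \approx 0.255\end{equation*}\]
Relative Frequency in Percentage: To express the relative frequency as a percentage, we multiply the relative frequency by 100. Continuing with the 60–70 range example:
\[\begin{equation*}\text{Relative Frequency of 60–70 (Percentage)} = \dfrac{28}{110} \times 100 \approx 25.455\%\end{equation*}\]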
Using this method, we can determine the relative frequencies and their percentage representation for each pulse rate range:
40–50:
Relative Frequency: \(\dfrac{1}{110} \approx 0.009\)
Relative Frequency (Percentage): \(\dfrac{1}{110} \times 100 \approx 0.909\%\)
50–60:
Relative Frequency: \(\dfrac{4}{110} \approx 0.036\)
Relative Frequency (Percentage): \(\dfrac{4}{110} \times 100 \approx 3.636\%\)
60–70:
Relative Frequency: \(\dfrac{28}{110} \approx 0.255\)
Relative Frequency (Percentage): \(\dfrac{28}{110} \times 100 \approx 25.455\%\)
70–80:
Relative Frequency: \(\dfrac{22}{110} = 0.200\)
Relative Frequency (Percentage): \(0.200 \times 100 = 20.000\%\)
80–90:
Relative Frequency: \(\dfrac{29}{110} \approx 0.264\)
Relative Frequency (Percentage): \(\dfrac{29}{110} \times 100 \approx 26.364\%\)
90–100:
Relative Frequency: \(\dfrac{21}{110} \approx 0.191\)
Relative Frequency (Percentage): \(\dfrac{21}{110} \times 100 \approx 19.091\%\)
100–110:
Relative Frequency: \(\dfrac{5}{110} \approx 0.045\)
Relative Frequency (Percentage): \(\dfrac{5}{110} \times 100 \approx 4.545\%\)
The sum of all relative frequencies equals 1, and the sum of all relative frequencies in percentage equals 100%, which confirms that our calculations are accurate. This approach allows us to understand the distribution of pulse rates within our dataset effectively.
Pulse Rates | Frequency | Relative Frequency | Relative Frequency (Percentage) |
---|---|---|---|
40–50 | 1 | 0.009 | 0.909 |
50–60 | 4 | 0.036 | 3.636 |
60–70 | 28 | 0.255 | 25.455 |
70–80 | 22 | 0.200 | 20.000 |
80–90 | 29 | 0.264 | 26.364 |
90–100 | 21 | 0.191 | 19.091 |
100–110 | 5 | 0.045 | 4.545 |
Total | 110 | 1.000 | 100.000 |
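One detail in this table is worth checking by hand: the percentages carry three decimals, so they have to be computed from the unrounded ratio rather than from the relative frequency already rounded to three decimals. The sketch below, using only the frequencies in the table, shows the difference for the 60–70 class. The variable names are illustrative.

```python
# Frequencies from the pulse-rate table.
frequency = {"40–50": 1, "50–60": 4, "60–70": 28, "70–80": 22,
             "80–90": 29, "90–100": 21, "100–110": 5}
total = sum(frequency.values())

# Percentages computed from the unrounded ratio, then rounded once.
percentage = {cls: round(100 * count / total, 3)
              for cls, count in frequency.items()}

# For comparison: rounding the relative frequency first loses digits.
rounded_first = round(frequency["60–70"] / total, 3) * 100

print(percentage["60–70"])      # from the unrounded ratio 28/110
print(round(rounded_first, 3))  # from 0.255, already rounded
```

Rounding only at the last step reproduces the table's 25.455, whereas multiplying the rounded 0.255 by 100 gives 25.5.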
A set of data points representing the number of hours students spent studying for an exam:
a. Create a frequency distribution table for the data points.
b. Calculate the relative frequency for each class interval.
Guidelines:
Organize the data into a reasonable number of class intervals (e.g., 0-2 hours, 2-4 hours, etc.).
The frequency distribution table should include columns for class intervals, frequency, and relative frequency.
The relative frequency is calculated by dividing the frequency of each class interval by the total number of data points.
Solution: We will use the following class intervals:
0–2 hours
2–4 hours
4–6 hours
6–8 hours
8–10 hours
Next, we count the number of data points that fall into each class interval to create the frequency distribution table.
Study Hours | Frequency |
---|---|
0–2 | 4 |
2–4 | 8 |
4–6 | 12 |
6–8 | 12 |
8–10 | 4 |
Total | 40 |
The relative frequency for each class interval is calculated by dividing the frequency of each class interval by the total number of data points (which is 40).
Study Hours | Frequency | Relative Frequency | Relative Frequency (%) |
---|---|---|---|
0–2 | 4 | 4/40 = 0.100 | 10.0% |
2–4 | 8 | 8/40 = 0.200 | 20.0% |
4–6 | 12 | 12/40 = 0.300 | 30.0% |
6–8 | 12 | 12/40 = 0.300 | 30.0% |
8–10 | 4 | 4/40 = 0.100 | 10.0% |
Total | 40 | 1.000 | 100.0% |
Thus,
Study Hours | Frequency | Relative Frequency | Relative Frequency (Percentage) |
---|---|---|---|
0–2 | 4 | 0.100 | 10.000 |
2–4 | 8 | 0.200 | 20.000 |
4–6 | 12 | 0.300 | 30.000 |
6–8 | 12 | 0.300 | 30.000 |
8–10 | 4 | 0.100 | 10.000 |
Total | 40 | 1.000 | 100.000 |
A set of data points representing the number of liters of water consumed per day by 30 individuals:
a. Create a frequency distribution table for the water consumption data points.
b. Calculate the relative frequency for each class interval.
Guidelines:
Organize the data into a reasonable number of class intervals (e.g., 0-1 liters, 1-2 liters, etc.).
The frequency distribution table should include columns for class intervals, frequency, and relative frequency.
The relative frequency is calculated by dividing the frequency of each class interval by the total number of data points.
Solution: We will use the following class intervals:
0–1 liters
1–2 liters
2–3 liters
3–4 liters
Next, we count the number of data points that fall into each class interval to create the frequency distribution table.
Water Consumption (liters) | Frequency |
---|---|
0–1 | 0 |
1–2 | 10 |
2–3 | 13 |
3–4 | 7 |
Total | 30 |
The relative frequency for each class interval is calculated by dividing the frequency of each class interval by the total number of data points (which is 30).
Water Consumption (liters) | Frequency | Relative Frequency | Relative Frequency (%) |
---|---|---|---|
0–1 | 0 | 0/30 = 0.000 | 0.0% |
1–2 | 10 | 10/30 ≈ 0.333 | 33.3% |
2–3 | 13 | 13/30 ≈ 0.433 | 43.3% |
3–4 | 7 | 7/30 ≈ 0.233 | 23.3% |
Total | 30 | 1.000 | 100.0% |
Thus,
Water Consumption (liters) | Frequency | Relative Frequency | Relative Frequency (Percentage) |
---|---|---|---|
0–1 | 0 | 0.000 | 0.000 |
1–2 | 10 | 0.333 | 33.333 |
2–3 | 13 | 0.433 | 43.333 |
3–4 | 7 | 0.233 | 23.333 |
Total | 30 | 1.000 | 100.000 |
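The Total row shows 1.000 even though the three-decimal entries above it add up to 0.999. The total reflects the exact ratios, not the rounded ones. A small sketch with Python's standard `fractions` module, using only the frequencies in the table, makes the distinction explicit. The variable names are illustrative.

```python
from fractions import Fraction

# Frequencies from the water consumption table.
frequency = {"0–1": 0, "1–2": 10, "2–3": 13, "3–4": 7}
total = sum(frequency.values())

# Exact relative frequencies as fractions (no rounding error).
exact = {cls: Fraction(count, total) for cls, count in frequency.items()}

# The same values rounded to three decimals, as printed in the table.
rounded = {cls: round(float(value), 3) for cls, value in exact.items()}

exact_sum = sum(exact.values())                # exactly 1
rounded_sum = round(sum(rounded.values()), 3)  # sum of the rounded entries

print(exact_sum, rounded_sum)
```

A rounded column that sums to 0.999 or 1.001 is therefore not a sign of a calculation error. It is rounding drift, and the exact relative frequencies still sum to 1.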