2.1. Frequency and Frequency Table

2.1.1. Frequency Distribution

Definition - Frequency Distribution

A frequency distribution, also known as a frequency table, summarizes data by grouping it into distinct classes and counting the number of observations within each class. For quantitative data, the classes are usually intervals of equal width, and the frequency of each class is the number of data values falling within that interval. Frequency distributions can be presented in either tabular or graphical form, which helps identify patterns, trends, and outliers in a dataset [Heumann and Schomaker, 2023, Illowsky and Dean, 2023].

A frequency distribution of qualitative data summarizes the different categories and how often each occurs in a data set. For example, given a data set of people’s favorite colors, we list each color together with the number of times it appears.

Constructing Frequency Distributions

To create a frequency distribution of qualitative data, follow these steps:

  1. Identify the different categories or values in the data set and write them in the first column of a table.

  2. For each data point, put a tally mark in the second column of the table next to the corresponding category or value.

  3. Count the number of tally marks for each category or value and write the totals in the third column of the table.

Example 2.1

Given the following list of color preferences collected from a group of students, perform a frequency analysis to determine the popularity of each color:

Green, Blue, Red, Blue
Pink, Red, Blue, Yellow
Red, Blue, Green, Blue
Pink, Blue, Blue, Red
Yellow, Green, Red, Green

Create a frequency distribution for these colors.

Solution: To create a frequency distribution for these colors, we count how many times each color appears in the list. The resulting frequency distribution table would be:

Frequency distribution:

| Color | Frequency |
|---|---|
| Blue | 7 |
| Red | 5 |
| Green | 4 |
| Pink | 2 |
| Yellow | 2 |

The table summarizes the data, showing the number of times each color appeared in the list. For example, there were 7 occurrences of the color “Blue” in the dataset, making it the most frequent color preference among the students. On the other hand, “Yellow” and “Pink” appeared only 2 times each, making them the least frequent color preferences.
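The tallying in Example 2.1 can also be sketched in a few lines of Python. This is only an illustration, not part of the original example: the variable names `colors` and `counts` are chosen here, and `collections.Counter` from the standard library stands in for the manual tally marks.

```python
from collections import Counter

# Color preferences from Example 2.1, row by row.
colors = [
    "Green", "Blue", "Red", "Blue",
    "Pink", "Red", "Blue", "Yellow",
    "Red", "Blue", "Green", "Blue",
    "Pink", "Blue", "Blue", "Red",
    "Yellow", "Green", "Red", "Green",
]

# Counter tallies how often each category occurs.
counts = Counter(colors)

# Print categories from most to least frequent.
for color, freq in counts.most_common():
    print(f"{color:<8}{freq}")
```

The printed frequencies should agree with the table above, with Blue as the most frequent category.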

How to Create a Frequency Table Using Intervals?

Creating a frequency table with intervals involves organizing data into classes or groups and then counting the number of data points that fall into each class. Here’s how you can do it:

  1. Determine the Range of the Data:

    • Find the minimum and maximum values in your dataset.

    • Example: If your dataset is 45, 52, 52, 59, 53, 104, 102, 107, 105, 102, the minimum value is 45 and the maximum value is 107.

  2. Decide the Number of Classes:

    • Choose the number of intervals (classes) you want to divide your data into. A common choice is between 5 and 10 classes.

    • Example: We choose 7 classes.

  3. Calculate the Class Width:

    • Use the formula:

    \[\text{Class Width} = \dfrac{\text{Range}}{\text{Number of Classes}} \tag{2.1}\]

    where the range is the maximum value minus the minimum value.

    • Round up to a convenient number if necessary.

    • Example:

    \[\text{Class Width} = \dfrac{107 - 45}{7} \approx 8.86\]

    Rounding up to a convenient number, we use a class width of 10.

  4. Determine the Class Limits:

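    • Start with the minimum value, or a convenient round number at or below it, and add the class width to find the upper limit of the first class.

    • Continue this process to create all classes.

    • Example: Starting from 40, the classes are 40–50, 50–60, 60–70, 70–80, 80–90, 90–100, 100–110. A value equal to a shared limit (such as 50) is counted in the class that starts at that limit.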

  5. Tally the Frequencies:

    • Count the number of data points that fall into each class.

    • Example: Count how many data points fall between 40 and 50, between 50 and 60, and so on.

  6. Create the Frequency Distribution Table:

    • List the classes and their corresponding frequencies.
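The six steps above can be sketched in Python using the small dataset from step 1. This is a minimal sketch under stated assumptions: "rounding up to a convenient number" is interpreted here as rounding up to the next multiple of 10, the first class starts at the multiple of the width at or below the minimum, and each class includes its lower limit but not its upper limit. The variable names are hypothetical.

```python
import math

data = [45, 52, 52, 59, 53, 104, 102, 107, 105, 102]
num_classes = 7

# Steps 1-3: range and raw class width.
data_range = max(data) - min(data)      # 107 - 45 = 62
raw_width = data_range / num_classes    # about 8.86

# Assumption: "convenient" means the next multiple of 10.
width = math.ceil(raw_width / 10) * 10

# Assumption: start at the multiple of the width at or below the minimum.
start = (min(data) // width) * width

# Step 4: class limits as (lower, upper) pairs.
classes = [(start + i * width, start + (i + 1) * width)
           for i in range(num_classes)]

# Step 5: tally values with lower <= x < upper.
freq = [sum(lo <= x < hi for x in data) for lo, hi in classes]

# Step 6: print the frequency distribution table.
for (lo, hi), f in zip(classes, freq):
    print(f"{lo}–{hi}: {f}")
```

A different rounding rule or starting point would give different classes, so these choices are worth stating alongside any table built this way.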

Example 2.2

Our heart rate, also known as our pulse, is the number of times our heart beats per minute. It varies from person to person and can be an important gauge of heart health. When we’re resting and calm, our heart rate is typically between 60 and 100 beats per minute. A resting heart rate above 100 is called tachycardia, while a rate below 60 is called bradycardia. A heart rate lower than 60 doesn’t necessarily signal a medical problem, especially if we’re taking certain medications or are physically active. Individual variations exist, and consulting a healthcare professional is essential if we have concerns about our heart rate. Consider the following dataset representing measured pulse rates from a fictional sample:

98, 88, 74, 67, 80, 98, 78, 82, 70, 70, 83, 95, 99, 83, 62, 81, 61, 83, 89, 97, 61, 80, 92, 71, 81, 84, 86, 87, 75, 74, 62, 96, 66, 80, 68, 98, 77, 63, 84, 73, 68, 85, 61, 79, 87, 66, 67, 94, 73, 76, 95, 99, 63, 61, 65, 63, 88, 77, 85, 93, 69, 95, 73, 90, 74, 67, 73, 82, 99, 80, 75, 77, 83, 85, 84, 88, 74, 60, 84, 66, 68, 83, 60, 67, 83, 70, 76, 67, 94, 94, 92, 64, 98, 87, 66, 68, 67, 71, 93, 92, 45, 52, 52, 59, 53, 104, 102, 107, 105, 102

Create a frequency distribution for this dataset. Use a class width of 10, starting from 40.

Solution: Before building the table, recall the terms used to describe its classes:

  1. Lower-class limits:

    • These represent the minimum values assigned to each class in a frequency distribution.

    • Example: In the class interval “50–60,” the lower-class limit is 50.

  2. Upper-class limits:

    • These are the maximum values assigned to each class.

    • Example: In the class interval “50–60,” the upper-class limit is 60.

  3. Class midpoints:

    • Class midpoints are values located in the center of each class.

    • They are calculated as the average of the lower-class limit and the upper-class limit.

    \[\text{Class midpoint} = \dfrac{\text{Lower-class limit} + \text{Upper-class limit}}{2} \tag{2.2}\]

    • Example: For the class interval “50–60,” the class midpoint is \((50 + 60)/2 = 55\).

  4. Class width:

    • The class width is the difference between two adjacent lower-class limits (or two adjacent lower-class boundaries) in a frequency distribution.

    • In the given example, the class width is 10.

Counting the values in each class, where a value equal to a shared limit (such as 60) is counted in the class that starts at that limit, gives the following table:

| Pulse Rates | Frequency |
|---|---|
| 40–50 | 1 |
| 50–60 | 4 |
| 60–70 | 28 |
| 70–80 | 22 |
| 80–90 | 29 |
| 90–100 | 21 |
| 100–110 | 5 |
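The pulse-rate table can be reproduced with a short Python sketch. It assumes, as the counts in the table imply, that each class includes its lower limit and excludes its upper limit, so the class index of a value is found by integer division. The names `pulse`, `freq`, and `midpoints` are chosen here for illustration.

```python
# Pulse rates from Example 2.2 (110 values).
pulse = [
    98, 88, 74, 67, 80, 98, 78, 82, 70, 70,
    83, 95, 99, 83, 62, 81, 61, 83, 89, 97,
    61, 80, 92, 71, 81, 84, 86, 87, 75, 74,
    62, 96, 66, 80, 68, 98, 77, 63, 84, 73,
    68, 85, 61, 79, 87, 66, 67, 94, 73, 76,
    95, 99, 63, 61, 65, 63, 88, 77, 85, 93,
    69, 95, 73, 90, 74, 67, 73, 82, 99, 80,
    75, 77, 83, 85, 84, 88, 74, 60, 84, 66,
    68, 83, 60, 67, 83, 70, 76, 67, 94, 94,
    92, 64, 98, 87, 66, 68, 67, 71, 93, 92,
    45, 52, 52, 59, 53, 104, 102, 107, 105, 102,
]

start, width, n_classes = 40, 10, 7

# Tally: a value x belongs to class number (x - start) // width.
freq = [0] * n_classes
for x in pulse:
    freq[(x - start) // width] += 1

# Class midpoints: average of the lower and upper class limits.
midpoints = [((start + i * width) + (start + (i + 1) * width)) / 2
             for i in range(n_classes)]

for i, f in enumerate(freq):
    lo = start + i * width
    print(f"{lo}–{lo + width}: frequency {f}, midpoint {midpoints[i]}")
```

The integer-division shortcut only works for equal-width classes; for unequal widths, an explicit comparison against each pair of limits would be needed.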

2.1.2. Relative Frequency Distribution

Definition - Relative Frequency Distribution

A relative frequency distribution, also known as a percentage frequency distribution, is an alternative form of the standard frequency distribution. In this variation, the frequencies in each class are replaced by relative frequencies (or proportions) or percentages. Regardless of whether we use relative frequencies or percentages, we refer to this variation as the “relative frequency distribution” [Heumann and Schomaker, 2023, Illowsky and Dean, 2023].

The calculations for relative frequencies and percentages are as follows:

  1. Relative Frequency for a Class:

    • The relative frequency for a class is the ratio of the frequency for that class to the sum of all frequencies. In practice, the result is usually rounded to a fixed number of decimal places.

    • Mathematically:

    \[\text{Relative frequency for a class} = \frac{\text{frequency for a class}}{\text{sum of all frequencies}} \tag{2.3}\]

  2. Percentage for a Class:

    • The percentage for a class is the same ratio multiplied by 100%.

    • Mathematically:

    \[\text{Percentage for a class} = \frac{\text{frequency for a class}}{\text{sum of all frequencies}} \times 100\% \tag{2.4}\]
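A short derivation suggests why the relative frequencies of a distribution add up to 1, and the percentages to 100%. The symbols \(f_i\) (frequency of class \(i\)), \(k\) (number of classes), and \(n\) (sum of all frequencies) are introduced here for illustration and are not part of the notation above:

\[\sum_{i=1}^{k} \frac{f_i}{n} = \frac{1}{n} \sum_{i=1}^{k} f_i = \frac{n}{n} = 1\]

Small deviations from 1 in a printed table are therefore typically the result of rounding the individual relative frequencies.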

Example 2.3

Create a relative frequency table for Example 2.1.

Solution: To create relative frequencies, we need to follow these steps:

  1. Calculate the Total Frequency: Add up the frequencies of all items to get the total frequency. In the table from Example 2.1, the total frequency is 20.

  2. Determine Relative Frequency: For each item, divide its frequency by the total frequency to get its relative frequency. For example, for Blue:

    \[\text{Relative Frequency of Blue} = \dfrac{\text{Frequency of Blue}}{\text{Total Frequency}} = \dfrac{7}{20} = 0.350\]

  3. Convert to Percentage: To express the relative frequency as a percentage, multiply it by 100. Continuing with the example of Blue:

    \[\begin{aligned} \text{Relative Frequency of Blue (Percentage)} &= \text{Relative Frequency of Blue} \times 100 \\ &= 0.350 \times 100 = 35.0\% \end{aligned}\]

In addition,

  • Red:

    • Relative Frequency: \(\dfrac{5}{20} = 0.250\)

    • Relative Frequency (Percentage): \(0.250 \times 100 = 25.0\%\)

  • Green:

    • Relative Frequency: \(\dfrac{4}{20} = 0.200\)

    • Relative Frequency (Percentage): \(0.200 \times 100 = 20.0\%\)

  • Pink and Yellow:

    • Relative Frequency: \(\dfrac{2}{20} = 0.100\)

    • Relative Frequency (Percentage): \(0.100 \times 100 = 10.0\%\)

The sum of all relative frequencies should equal 1, and the sum of all relative frequencies in percentage should equal 100%. This confirms that the relative frequencies are correctly calculated.

| Color | Frequency | Relative Frequency | Relative Frequency (Percentage) |
|---|---|---|---|
| Blue | 7 | 0.35 | 35.00 |
| Red | 5 | 0.25 | 25.00 |
| Green | 4 | 0.20 | 20.00 |
| Pink | 2 | 0.10 | 10.00 |
| Yellow | 2 | 0.10 | 10.00 |
| Total | 20 | 1.00 | 100.00 |
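The calculation in Example 2.3 can be sketched in Python as follows. The dictionary names `freq`, `rel`, and `pct` are chosen here for illustration; the frequencies are taken from Example 2.1.

```python
# Frequencies from Example 2.1.
freq = {"Blue": 7, "Red": 5, "Green": 4, "Pink": 2, "Yellow": 2}

# Step 1: total frequency.
total = sum(freq.values())

# Step 2: relative frequency = class frequency / total frequency.
rel = {color: f / total for color, f in freq.items()}

# Step 3: percentage = relative frequency * 100.
pct = {color: 100 * r for color, r in rel.items()}

for color in freq:
    print(f"{color:<8}{freq[color]:>3}{rel[color]:>7.2f}{pct[color]:>8.2f}")
print(f"{'Total':<8}{total:>3}{sum(rel.values()):>7.2f}{sum(pct.values()):>8.2f}")
```

Checking that the last row shows 1.00 and 100.00 is a quick sanity check on the calculation.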

Example 2.4

Create a relative frequency table for Example 2.2.

Solution: We need to follow these steps:

  1. Total Frequency: We first sum up the frequencies of all pulse rate ranges to get our total frequency. In our table, the total frequency is 110.

  2. Relative Frequency: We then calculate the relative frequency for each pulse rate range by dividing its frequency by the total frequency. For instance, for the 60–70 range:

    \[\text{Relative Frequency for 60–70} = \dfrac{\text{Frequency for 60–70}}{\text{Total Frequency}} = \dfrac{28}{110} \approx 0.255\]

  3. Relative Frequency in Percentage: To express the relative frequency as a percentage, we multiply the relative frequency by 100. To keep three decimal places in the percentage, we use the unrounded ratio rather than the rounded value 0.255. Continuing with the 60–70 range example:

    \[\text{Relative Frequency for 60–70 (Percentage)} = \dfrac{28}{110} \times 100 \approx 25.455\%\]

Using this method, we can determine the relative frequencies and their percentage representation for each pulse rate range. The percentages are computed from the unrounded ratios:

  • 40–50:

    • Relative Frequency: \(\dfrac{1}{110} \approx 0.009\)

    • Relative Frequency (Percentage): \(\dfrac{1}{110} \times 100 \approx 0.909\%\)

  • 50–60:

    • Relative Frequency: \(\dfrac{4}{110} \approx 0.036\)

    • Relative Frequency (Percentage): \(\dfrac{4}{110} \times 100 \approx 3.636\%\)

  • 60–70:

    • Relative Frequency: \(\dfrac{28}{110} \approx 0.255\)

    • Relative Frequency (Percentage): \(\dfrac{28}{110} \times 100 \approx 25.455\%\)

  • 70–80:

    • Relative Frequency: \(\dfrac{22}{110} = 0.200\)

    • Relative Frequency (Percentage): \(0.200 \times 100 = 20.000\%\)

  • 80–90:

    • Relative Frequency: \(\dfrac{29}{110} \approx 0.264\)

    • Relative Frequency (Percentage): \(\dfrac{29}{110} \times 100 \approx 26.364\%\)

  • 90–100:

    • Relative Frequency: \(\dfrac{21}{110} \approx 0.191\)

    • Relative Frequency (Percentage): \(\dfrac{21}{110} \times 100 \approx 19.091\%\)

  • 100–110:

    • Relative Frequency: \(\dfrac{5}{110} \approx 0.045\)

    • Relative Frequency (Percentage): \(\dfrac{5}{110} \times 100 \approx 4.545\%\)

The sum of all relative frequencies equals 1, and the sum of all relative frequencies in percentage equals 100%, which confirms that our calculations are accurate. This approach allows us to understand the distribution of pulse rates within our dataset effectively.

| Pulse Rates | Frequency | Relative Frequency | Relative Frequency (Percentage) |
|---|---|---|---|
| 40–50 | 1 | 0.009 | 0.909 |
| 50–60 | 4 | 0.036 | 3.636 |
| 60–70 | 28 | 0.255 | 25.455 |
| 70–80 | 22 | 0.200 | 20.000 |
| 80–90 | 29 | 0.264 | 26.364 |
| 90–100 | 21 | 0.191 | 19.091 |
| 100–110 | 5 | 0.045 | 4.545 |
| Total | 110 | 1.000 | 100.000 |
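The percentages in this table have three decimal places, which is only possible if they are computed from the unrounded ratios. The following Python sketch, with variable names chosen here for illustration, compares the two ways of computing a percentage for the frequencies of Example 2.2.

```python
# Frequencies from Example 2.2, classes 40–50 through 100–110.
freq = [1, 4, 28, 22, 29, 21, 5]
total = sum(freq)

# Percentage from the unrounded ratio, rounded once at the end.
pct_exact = [round(100 * f / total, 3) for f in freq]

# Percentage from a relative frequency already rounded to 3 decimals.
pct_from_rounded = [round(100 * round(f / total, 3), 3) for f in freq]

for f, a, b in zip(freq, pct_exact, pct_from_rounded):
    print(f"frequency {f:>2}: {a:>7.3f}  vs  {b:>7.3f}")
```

For the 60–70 class, the first approach gives 25.455 while the second gives 25.5, so rounding is best left to the final step.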

Example 2.5

The following data points represent the number of hours students spent studying for an exam:

1, 2, 3, 5, 1, 2, 1, 2, 5, 1, 4, 7, 5, 6, 3, 8, 2, 6, 7, 5, 5, 6, 7, 8, 9, 5, 4, 6, 7, 3, 2, 5, 6, 7, 4, 5, 6, 7, 8, 5

  • a. Create a frequency distribution table for the data points.

  • b. Calculate the relative frequency for each class interval.

Guidelines:

  • Organize the data into a reasonable number of class intervals (e.g., 0–2 hours, 2–4 hours, etc.).

  • The frequency distribution table should include columns for class intervals, frequency, and relative frequency.

  • The relative frequency is calculated by dividing the frequency of each class interval by the total number of data points.

Solution: We will use the following class intervals:

  • 0–2 hours

  • 2–4 hours

  • 4–6 hours

  • 6–8 hours

  • 8–10 hours

Next, we count the number of data points that fall into each class interval to create the frequency distribution table. A value equal to a shared limit (such as 2) is counted in the class that starts at that limit.

| Study Hours | Frequency |
|---|---|
| 0–2 | 4 |
| 2–4 | 8 |
| 4–6 | 12 |
| 6–8 | 12 |
| 8–10 | 4 |
| Total | 40 |

The relative frequency for each class interval is calculated by dividing the frequency of each class interval by the total number of data points (which is 40).

| Study Hours | Frequency | Relative Frequency | Relative Frequency (%) |
|---|---|---|---|
| 0–2 | 4 | 4/40 = 0.100 | 10.0% |
| 2–4 | 8 | 8/40 = 0.200 | 20.0% |
| 4–6 | 12 | 12/40 = 0.300 | 30.0% |
| 6–8 | 12 | 12/40 = 0.300 | 30.0% |
| 8–10 | 4 | 4/40 = 0.100 | 10.0% |
| Total | 40 | 1.000 | 100.0% |

Thus,

| Study Hours | Frequency | Relative Frequency | Relative Frequency (Percentage) |
|---|---|---|---|
| 0–2 | 4 | 0.100 | 10.000 |
| 2–4 | 8 | 0.200 | 20.000 |
| 4–6 | 12 | 0.300 | 30.000 |
| 6–8 | 12 | 0.300 | 30.000 |
| 8–10 | 4 | 0.100 | 10.000 |
| Total | 40 | 1.000 | 100.000 |
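Example 2.5 can be carried out end to end with a small helper function. This is a sketch rather than a definitive implementation: the function name `frequency_table` and its parameters are hypothetical, and it assumes equal-width classes that include the lower limit and exclude the upper limit.

```python
hours = [1, 2, 3, 5, 1, 2, 1, 2, 5, 1, 4, 7, 5, 6, 3, 8, 2, 6, 7, 5,
         5, 6, 7, 8, 9, 5, 4, 6, 7, 3, 2, 5, 6, 7, 4, 5, 6, 7, 8, 5]


def frequency_table(data, start, width, n_classes):
    """Return rows of (label, frequency, relative frequency, percentage)."""
    total = len(data)
    rows = []
    for i in range(n_classes):
        lo = start + i * width
        hi = lo + width
        # Count values with lo <= x < hi.
        f = sum(lo <= x < hi for x in data)
        rows.append((f"{lo}–{hi}", f, f / total, 100 * f / total))
    return rows


rows = frequency_table(hours, start=0, width=2, n_classes=5)
for label, f, rel, pct in rows:
    print(f"{label:<6}{f:>4}{rel:>8.3f}{pct:>9.3f}")
```

Values outside the range covered by the classes would simply go uncounted, so it is worth confirming that the frequencies sum to the number of data points.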

Example 2.6

The following data points represent the number of liters of water consumed per day by 30 individuals:

2, 1, 3, 2.5, 1.5, 3, 2, 1, 2.5, 3, 1.5, 2, 3, 2.5, 1, 2, 3, 1.5, 2, 2.5, 3, 1, 2, 2.5, 1.5, 3, 2, 1, 2.5, 1.5

  • a. Create a frequency distribution table for the water consumption data points.

  • b. Calculate the relative frequency for each class interval.

Guidelines:

  • Organize the data into a reasonable number of class intervals (e.g., 0–1 liters, 1–2 liters, etc.).

  • The frequency distribution table should include columns for class intervals, frequency, and relative frequency.

  • The relative frequency is calculated by dividing the frequency of each class interval by the total number of data points.

Solution: We will use the following class intervals:

  • 0–1 liters

  • 1–2 liters

  • 2–3 liters

  • 3–4 liters

Next, we count the number of data points that fall into each class interval to create the frequency distribution table. A value equal to a shared limit (such as 2) is counted in the class that starts at that limit.

| Water Consumption (liters) | Frequency |
|---|---|
| 0–1 | 0 |
| 1–2 | 10 |
| 2–3 | 13 |
| 3–4 | 7 |
| Total | 30 |

The relative frequency for each class interval is calculated by dividing the frequency of each class interval by the total number of data points (which is 30).

| Water Consumption (liters) | Frequency | Relative Frequency | Relative Frequency (%) |
|---|---|---|---|
| 0–1 | 0 | 0/30 = 0.000 | 0.0% |
| 1–2 | 10 | 10/30 ≈ 0.333 | 33.3% |
| 2–3 | 13 | 13/30 ≈ 0.433 | 43.3% |
| 3–4 | 7 | 7/30 ≈ 0.233 | 23.3% |
| Total | 30 | 1.000 | 100.0% |

Thus,

| Water Consumption (liters) | Frequency | Relative Frequency | Relative Frequency (Percentage) |
|---|---|---|---|
| 0–1 | 0 | 0.000 | 0.000 |
| 1–2 | 10 | 0.333 | 33.333 |
| 2–3 | 13 | 0.433 | 43.333 |
| 3–4 | 7 | 0.233 | 23.333 |
| Total | 30 | 1.000 | 100.000 |
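Because many values in Example 2.6 sit exactly on a class limit (1, 2, and 3 liters), the counts depend on which class a boundary value is assigned to. The sketch below, with names chosen here for illustration, compares two possible conventions and suggests that the table above counts a boundary value in the class that starts at it.

```python
water = [2, 1, 3, 2.5, 1.5, 3, 2, 1, 2.5, 3,
         1.5, 2, 3, 2.5, 1, 2, 3, 1.5, 2, 2.5,
         3, 1, 2, 2.5, 1.5, 3, 2, 1, 2.5, 1.5]

classes = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Convention A: include the lower limit, exclude the upper limit.
left_closed = [sum(lo <= x < hi for x in water) for lo, hi in classes]

# Convention B: exclude the lower limit, include the upper limit.
right_closed = [sum(lo < x <= hi for x in water) for lo, hi in classes]

print(left_closed)   # [0, 10, 13, 7], matches the table in Example 2.6
print(right_closed)  # [5, 12, 13, 0]
```

Either convention is usable as long as it is stated and applied consistently, since both assign every value to exactly one class.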