6.2. Confidence Intervals for a Single Population Mean When $\sigma$ is Known

When the population standard deviation ($\sigma$) is known, a sample can be used to construct a confidence interval for the population mean ($\mu$): a range of values, centered at the sample mean, produced by a procedure that captures the true mean in a specified proportion of repeated samples (the confidence level). This section introduces the margin of error, the procedure for constructing the interval, and the sample size needed to reach a target precision.

6.2.1. Margin of Error (E) for Estimating $\mu$

The margin of error is the largest distance we expect, at the chosen confidence level, between the sample mean and the true population mean. It is denoted by $E$ and is calculated as follows:

\[\begin{equation} E = z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} \tag{6.1} \end{equation}\]

Where:

$z_{\alpha/2}$ is the critical z-value corresponding to the desired confidence level.
$\sigma$ is the known standard deviation of the population.
$n$ is the sample size.

When the sample comes from a normal distribution with a known variance, the interval estimate for $\mu$ can be expressed as the range from $\overline{x} - E$ to $\overline{x} + E$, where $\overline{x}$ is the sample mean.
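As a minimal sketch of Equation (6.1), the margin of error can be computed in Python with only the standard library. The function name `margin_of_error`, its parameter names, and the input values are illustrative assumptions, not part of the text. The critical value comes from `statistics.NormalDist` rather than a printed z-table, so it is slightly more precise than the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sigma: float, n: int, confidence: float = 0.95) -> float:
    """Return E = z_{alpha/2} * sigma / sqrt(n) for a known population sigma."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail of the standard normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / sqrt(n)


# Hypothetical inputs: sigma = 10, n = 25, 95% confidence.
E = margin_of_error(sigma=10, n=25, confidence=0.95)
print(f"E = {E:.2f}")
```

The interval estimate is then $\overline{x} \pm E$ for whatever sample mean was observed.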

Fig. 6.7 visually represents the concept of the margin of error (E) in estimating the population mean ($\mu$). It shows a range centered around the sample mean ($\overline{x}$), extending from $\overline{x} - E$ to $\overline{x} + E$. This range indicates the interval within which the true population mean is likely to fall, considering the confidence level and sample size.

Fig. 6.7 Visualizing the Margin of Error in Estimating Population Mean when $\sigma$ is known.

6.2.2. Procedure for Estimating a Population Mean with Known Standard Deviation

The margin of error leads directly to a procedure for constructing a confidence interval for the population mean ($\mu$) when the population standard deviation ($\sigma$) is known. The following algorithm outlines the steps.

Algorithm 6.1 (Estimating a Population Mean with Known Standard Deviation)

Objective: To calculate a confidence interval for the population mean ($\mu$).

Prerequisites:

A simple random sample is obtained.
The population is normally distributed, or the sample size is sufficiently large.
The population standard deviation ($\sigma$) is known.

Method:

1. Determine the Critical Value: For a specified confidence level ($1 - \alpha$), find the critical z-value ($z_{\alpha/2}$) from the z-distribution table.
2. Calculate the Interval: Construct the confidence interval for $\mu$ using the formula:

\[\begin{equation} \left( \overline{x} - z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}},~\overline{x} + z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} \right) \tag{6.2} \end{equation}\]

Where:

$\overline{x}$ is the sample mean.
$n$ is the sample size.
$z_{\alpha/2}$ is the critical z-value obtained in step 1.

This interval provides the range within which the true population mean is likely to be found with the given level of confidence.
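The two steps of Algorithm 6.1 can be sketched as a small Python function. This is one possible implementation under stated assumptions: the name `z_interval`, the input checks, and the example numbers are hypothetical, and the prerequisites (random sample, normal population or large $n$, known $\sigma$) still have to be checked by the analyst because the code cannot verify them.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar: float, sigma: float, n: int, confidence: float) -> tuple[float, float]:
    """Confidence interval for mu when the population sigma is known."""
    if n <= 0 or sigma <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need n > 0, sigma > 0 and 0 < confidence < 1")
    # Step 1: critical value z_{alpha/2} for the confidence level 1 - alpha.
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Step 2: interval x_bar -/+ z * sigma / sqrt(n).
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical summary statistics, chosen only for illustration.
lower, upper = z_interval(x_bar=100.0, sigma=15.0, n=36, confidence=0.95)
print(f"({lower:.2f}, {upper:.2f})")
```

Returning both endpoints as a tuple keeps the function usable for any confidence level, not only the tabulated ones.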

Example 6.8

Use a confidence level of 95% and estimate the population mean price ($\mu$) of all new laptops using the sample data provided below. We’ll assume that the population standard deviation ($\sigma$) of all such prices is $500.

Sample Data:

2185.43, 4778.21, 3793.97, 3193.96, 1202.08, 1201.98, 761.38, 4397.79, 3205.02, 3686.33, 592.63, 4864.59, 4245.99, 1455.53, 1318.21, 1325.32, 1869.09, 2861.4, 2443.75, 1810.53

Solution: First, we calculate the sample mean ($\overline{x}$):

\[\begin{equation*} \overline{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i \end{equation*}\]

Where $n$ is the sample size (the number of observations), and $x_i$ is the $i$-th observation.

After calculating the sample mean, we find:

\[\begin{equation*} \overline{x} = 2559.6595 \end{equation*}\]

Next, we calculate the standard error of the mean ($\sigma_{\overline{x}}$):

\[\begin{equation*} \sigma_{\overline{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{500}{\sqrt{20}} = 111.8034 \end{equation*}\]

Assuming a confidence level of 95% and a corresponding $z_{\alpha/2}$ value of 1.96, we can calculate the confidence interval for the population mean:

\[\begin{equation*} \overline{x} - z_{\alpha/2} \cdot \sigma_{\overline{x}} = 2559.6595 - 1.96 \cdot 111.8034 = 2340.52 \end{equation*}\]

\[\begin{equation*} \overline{x} + z_{\alpha/2} \cdot \sigma_{\overline{x}} = 2559.6595 + 1.96 \cdot 111.8034 = 2778.79 \end{equation*}\]

So, the 95% confidence interval for the population mean price of new laptops is:

\[\begin{equation*} \$2340.52 \leq \mu \leq \$2778.79 \end{equation*}\]

This interval estimates the range in which the true population mean price is likely to fall with 95% confidence. Note that the actual calculation of the sample mean ($\overline{x}$) required adding up all the sample values and dividing by the number of samples (20 in this case).
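The arithmetic of this example can be checked with a short Python sketch. It uses the sample data and the assumed $\sigma$ from the example, and the rounded table value $z_{\alpha/2} = 1.96$ so that the results match the hand calculation. The variable names are illustrative choices.

```python
from math import sqrt
from statistics import fmean

prices = [
    2185.43, 4778.21, 3793.97, 3193.96, 1202.08, 1201.98, 761.38, 4397.79,
    3205.02, 3686.33, 592.63, 4864.59, 4245.99, 1455.53, 1318.21, 1325.32,
    1869.09, 2861.4, 2443.75, 1810.53,
]
sigma = 500.0   # population standard deviation assumed in the example
z = 1.96        # rounded table value for 95% confidence, as used in the text

n = len(prices)
x_bar = fmean(prices)                 # sample mean
std_error = sigma / sqrt(n)           # standard error of the mean
margin = z * std_error                # margin of error E
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, x_bar = {x_bar:.4f}, SE = {std_error:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Using a more precise critical value (about 1.95996) instead of 1.96 would shift the endpoints by less than one cent.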

Fig. 6.8 provides a visual representation of the 95% confidence interval for the population mean laptop price:

The center line represents the sample mean ($\overline{x} = \$2559.66$), our best point estimate of the population mean.
The shaded area represents the 95% confidence interval, ranging from $2340.52 to $2778.79.
The width of the shaded area is $438.27, twice the margin of error ($219.13), indicating the precision of our estimate.

Fig. 6.8 Visualization of the 95% Confidence Interval for the Population Mean Laptop Price

This visualization helps us interpret the results in the context of laptop prices:

We can be 95% confident that the true population mean price of new laptops falls somewhere within the shaded area.
While the sample mean is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
A narrower shaded area would indicate a more precise estimate of the population mean laptop price.

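This interval is based on our sample of 20 laptops and the assumed population standard deviation of $500. Because the sample is small ($n = 20$), using the normal distribution here relies on laptop prices being approximately normally distributed, as listed in the prerequisites of Algorithm 6.1; the Central Limit Theorem alone is a weak justification at this sample size.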

Example 6.9

The Kitchen Remodeling Cost Index provides insights into the budget amounts for kitchen remodeling projects. The table below displays the budgets, in dollars, of 50 randomly sampled kitchen remodeling jobs across the country.

4250, 3890, 4675, 5120, 4385, 4150, 3905, 4780, 4025, 4950, 5800, 4765, 6250, 3865, 3450, 5200, 4140, 3050, 4155, 5560, 4425, 2595, 4100, 4540, 3515, 3920, 3880, 5600, 3615, 4500, 3650, 2220, 2995, 6140, 5655, 3125, 3450, 3065, 4050, 3150, 1630, 5650, 6065, 3320, 2470

a. Determine a point estimate for the population mean budget, $\mu$, for such kitchen remodeling jobs. Interpret your answer in words.
(Note: The sum of the data is $210,000.)
b. Obtain a 95% confidence interval for the population mean budget, $\mu$, for such kitchen remodeling jobs and interpret your result in words. Assume that the population standard deviation of budgets for kitchen remodeling jobs is $1600.

Solution:

a. To determine a point estimate for the population mean budget, $\mu$, we use the sample mean ($\overline{x}$) as the point estimate. Given the sum of the sample data is $210,000 and there are 50 samples, the sample mean is:

\[\begin{equation*} \overline{x} = \dfrac{\text{Sum of sample data}}{\text{Number of samples}} = \dfrac{210,000}{50} = 4200 \end{equation*}\]

So, the point estimate for the population mean budget for kitchen remodeling jobs is $4200. This means that based on our sample, we estimate that the average budget for kitchen remodeling projects across the population is $4200.

b. To obtain a 95% confidence interval for the population mean budget, we first calculate the standard error of the mean ($\sigma_{\overline{x}}$):

\[\begin{equation*} \sigma_{\overline{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{1600}{\sqrt{50}} = 226.27 \end{equation*}\]

Assuming a confidence level of 95% and a corresponding $z_{\alpha/2}$ value of 1.96, the confidence interval is calculated as:

\[\begin{equation*} \overline{x} - z_{\alpha/2} \cdot \sigma_{\overline{x}} = 4200 - 1.96 \cdot 226.27 = 3756.50 \end{equation*}\]

\[\begin{equation*} \overline{x} + z_{\alpha/2} \cdot \sigma_{\overline{x}} = 4200 + 1.96 \cdot 226.27 = 4643.50 \end{equation*}\]

Therefore, the 95% confidence interval for the population mean budget for kitchen remodeling jobs is:

\[\begin{equation*} \$3756.50 \leq \mu \leq \$4643.50 \end{equation*}\]

This interval suggests that we can be 95% confident that the true average budget for kitchen remodeling projects across the population lies between $3756.50 and $4643.50. This information can be useful for businesses and individuals planning their budget for such projects.
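Both parts of this example can be reproduced from the summary figures alone. The sketch below is an illustration under the example's own assumptions: it uses the stated sum ($210,000), the stated sample size (50), the assumed $\sigma$, and the rounded $z_{\alpha/2} = 1.96$. It does not re-enter the individual budgets, and the variable names are hypothetical.

```python
from math import sqrt

total = 210_000.0   # sum of the sample budgets, as given in the note
n = 50              # sample size stated in the example
sigma = 1600.0      # assumed population standard deviation
z = 1.96            # rounded table value for 95% confidence

x_bar = total / n                     # part (a): point estimate of mu
margin = z * sigma / sqrt(n)          # margin of error E
lower, upper = x_bar - margin, x_bar + margin   # part (b)

print(f"point estimate = {x_bar:.2f}")
print(f"E = {margin:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```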

Fig. 6.9 provides a visual representation of the 95% confidence interval for the population mean kitchen remodeling budget:

The center line represents the sample mean ($\overline{x} = \$4200$), our best point estimate of the population mean.
The shaded area represents the 95% confidence interval, ranging from $3756.50 to $4643.50.
The width of the shaded area is $887, twice the margin of error ($443.50), indicating the precision of our estimate.

Fig. 6.9 Visualization of the 95% Confidence Interval for the Population Mean Kitchen Remodeling Budget

This visualization helps us interpret the results in the context of kitchen remodeling budgets:

We can be 95% confident that the true population mean budget for kitchen remodeling projects falls somewhere within the shaded area.
While the sample mean of $4200 is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The width of the shaded area suggests that our estimate has a precision of about ±$443.50.

This interval is based on our sample of 50 kitchen remodeling jobs and the assumed population standard deviation of $1600, using the normal distribution as justified by the Central Limit Theorem.

Example 6.10

The National Car Maintenance Costs Index provides information on the average costs of car maintenance services. The following table displays the costs, in dollars, of 40 randomly sampled car maintenance jobs from various service centers nationwide.

340, 285, 450, 390, 310, 275, 460, 430, 395, 415, 500, 475, 550, 420, 305, 490, 435, 560, 385, 520, 410, 235, 390, 425, 350, 370, 365, 480, 305, 420, 545, 225, 590, 510, 475, 515

a. Determine a point estimate for the population mean cost, $\mu$, for car maintenance services. Interpret your answer in words.
(Note: The sum of the data is $15,000.)

b. Obtain a 95% confidence interval for the population mean cost, $\mu$, for car maintenance services and interpret your result in words. Assume that the population standard deviation of costs for car maintenance services is $75.

Solution:

a. To determine a point estimate for the population mean cost, $\mu$, we use the sample mean ($\overline{x}$) as the point estimate. Given the sum of the sample data is $15,000 and there are 40 samples, the sample mean is:

\[\begin{equation*} \overline{x} = \dfrac{\text{Sum of sample data}}{\text{Number of samples}} = \dfrac{15,000}{40} = 375 \end{equation*}\]

So, the point estimate for the population mean cost for car maintenance services is $375. This means that based on our sample, we estimate that the average cost for car maintenance services across the population is $375.

b. To obtain a 95% confidence interval for the population mean cost, we first calculate the standard error of the mean ($\sigma_{\overline{x}}$):

\[\begin{equation*} \sigma_{\overline{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{75}{\sqrt{40}} = 11.86 \end{equation*}\]

Assuming a confidence level of 95% and a corresponding $z_{\alpha/2}$ value of 1.96, the confidence interval is calculated as:

\[\begin{equation*} \overline{x} - z_{\alpha/2} \cdot \sigma_{\overline{x}} = 375 - 1.96 \cdot 11.86 = 351.76 \end{equation*}\]

\[\begin{equation*} \overline{x} + z_{\alpha/2} \cdot \sigma_{\overline{x}} = 375 + 1.96 \cdot 11.86 = 398.24 \end{equation*}\]

Therefore, the 95% confidence interval for the population mean cost for car maintenance services is:

\[\begin{equation*} \$351.76 \leq \mu \leq \$398.24 \end{equation*}\]

This interval suggests that we can be 95% confident that the true average cost for car maintenance services across the population lies between $351.76 and $398.24. This information can be valuable for service centers and customers to understand the typical range of maintenance costs.

Fig. 6.10 provides a visual representation of the 95% confidence interval for the population mean car maintenance cost:

The center line represents the sample mean ($\overline{x} = \$375$), our best point estimate of the population mean.
The shaded area represents the 95% confidence interval, ranging from $351.76 to $398.24.
The width of the shaded area is $46.48, twice the margin of error ($23.24), indicating the precision of our estimate.

Fig. 6.10 Visualization of the 95% Confidence Interval for the Population Mean Car Maintenance Cost

This visualization helps us interpret the results in the context of car maintenance costs:

We can be 95% confident that the true population mean cost for car maintenance services falls somewhere within the shaded area.
While the sample mean of $375 is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The relatively narrow width of the shaded area suggests a fairly precise estimate, which can be attributed to the large sample size (40) and the assumed population standard deviation.

This interval is based on our sample of 40 car maintenance jobs and the assumed population standard deviation of $75, using the normal distribution as justified by the Central Limit Theorem.

Example 6.11

The Environmental Protection Agency (EPA) is monitoring the level of a particular air pollutant in a city. The EPA has established that the population standard deviation of the daily average pollutant level is 8 $\mu g/m^3$. A random sample of 30 days is selected, and the mean of the daily average pollutant levels is found to be 75 $\mu g/m^3$. Construct a 95% confidence interval for the true mean daily average pollutant level.

Solution: Given:

Sample mean ($\bar{x}$) = 75 $\mu g/m^3$
Population standard deviation ($\sigma$) = 8 $\mu g/m^3$
Sample size ($n$) = 30
For a 95% confidence interval, $z_{\alpha/2}$ is approximately 1.96 (from Z-tables)

We can calculate the confidence interval:

Step 1: Identify the sample mean (given in the problem):

\[\begin{equation*}\bar{x} = 75~ \mu g/m^3\end{equation*}\]

Step 2: Determine the margin of error:

\[\begin{equation*}E \approx 1.96 \left(\dfrac{8}{\sqrt{30}}\right) \approx 2.86 ~ \mu g/m^3\end{equation*}\]

Step 3: Calculate the confidence interval:
- Confidence interval (CI) = (75 - 2.86, 75 + 2.86) $\mu g/m^3$
- Confidence interval (CI) = (72.14, 77.86) $\mu g/m^3$

So, we can say with 95% confidence that the true mean daily average pollutant level in the city is between 72.14 $\mu g/m^3$ and 77.86 $\mu g/m^3$.

Fig. 6.11 provides a visual representation of the 95% confidence interval for the mean daily average pollutant level:

The center line represents the sample mean ($\bar{x} = 75 \mu g/m^3$), our best point estimate of the population mean.
The shaded area represents the 95% confidence interval, ranging from 72.14 $\mu g/m^3$ to 77.86 $\mu g/m^3$.
Half the width of the shaded area is the margin of error (2.86 $\mu g/m^3$), indicating the precision of our estimate.

Fig. 6.11 Visualization of the 95% Confidence Interval for the Mean Daily Average Pollutant Level

This visualization helps us interpret the results in the context of air pollution levels:

We can be 95% confident that the true population mean daily average pollutant level falls somewhere within the shaded area.
While the sample mean of 75 $\mu g/m^3$ is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The relatively narrow width of the shaded area suggests a fairly precise estimate, which can be attributed to the large sample size (30) and the known population standard deviation.

This interval is based on our sample of 30 days and the known population standard deviation of 8 $\mu g/m^3$, using the normal distribution as justified by the Central Limit Theorem.

Example 6.12

A financial analyst wants to estimate the average annual return of a particular stock. Historical data suggests that the population standard deviation of the stock’s annual return is 5%. The analyst selects a random sample of 50 years of returns and calculates a sample mean annual return of 7%.

Calculate a 99% confidence interval for the true average annual return of the stock.

Solution: Given:

$\bar{x} = 7\%$
$\sigma = 5\%$
$n = 50$
For a 99% confidence interval, $z_{\alpha/2}$ is approximately 2.576 (from Z-tables)

Calculating the margin of error:

\[\begin{equation*} E = 2.576 \left(\dfrac{5\%}{\sqrt{50}}\right) \approx 1.82\% \end{equation*}\]

Therefore, the 99% confidence interval is:

\[\begin{equation*}CI = 7\% \pm 2.576 \left(\dfrac{5\%}{\sqrt{50}}\right) = 7\% \pm 1.82\%\end{equation*}\]

Which gives us:

\[\begin{equation*}CI = (5.18\%, 8.82\%)\end{equation*}\]

So, we can say with 99% confidence that the true average annual return of the stock is between 5.18% and 8.82%.

The standard error (SE) is approximately 0.71%, which is used to calculate the margin of error (E) and the confidence interval (CI). The final confidence interval reflects the range within which we can be 99% confident that the true average annual return lies.
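The quantities named above (critical value, SE, E, and CI) can be recomputed with a short sketch. It obtains $z_{\alpha/2}$ from `statistics.NormalDist` instead of a z-table, which is an implementation choice made here rather than something the example prescribes. All values are in percentage points.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 7.0, 5.0, 50       # all in percentage points
confidence = 0.99

# Upper-tail critical value: area alpha/2 = 0.005 to its right.
z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
std_error = sigma / sqrt(n)
margin = z * std_error

print(f"z = {z:.3f}, SE = {std_error:.2f}%, E = {margin:.2f}%")
print(f"99% CI: ({x_bar - margin:.2f}%, {x_bar + margin:.2f}%)")
```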

Fig. 6.12 provides a visual representation of the 99% confidence interval for the average annual stock return:

The center line represents the sample mean ($\bar{x} = 7\%$), our best point estimate of the population mean.
The shaded area represents the 99% confidence interval, ranging from 5.18% to 8.82%.
Half the width of the shaded area is the margin of error (1.82%), indicating the precision of our estimate.

Fig. 6.12 Visualization of the 99% Confidence Interval for the Average Annual Stock Return

This visualization helps us interpret the results in the context of stock returns:

We can be 99% confident that the true population mean annual return falls somewhere within the shaded area.
While the sample mean of 7% is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The width of the shaded area reflects the high level of confidence (99%) and the variability in stock returns.

This interval is based on our sample of 50 years of returns and the known population standard deviation of 5%, using the normal distribution as justified by the Central Limit Theorem. The wider interval compared to a 95% confidence interval reflects the higher level of confidence required.
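To illustrate how the confidence level drives the width, the sketch below recomputes the full interval width ($2E$) for the same $\sigma$ and $n$ at three confidence levels. The choice of levels and the dictionary `widths` are assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 5.0, 50                   # stock-return example, percentage points
std_error = sigma / sqrt(n)

widths = {}
for confidence in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    widths[confidence] = 2.0 * z * std_error    # full width = 2E
    print(f"{confidence:.0%} confidence: z = {z:.3f}, width = {widths[confidence]:.2f}%")
```

With the data held fixed, a higher confidence level always gives a larger critical value and therefore a wider interval.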

Example 6.13

A public health researcher is evaluating the effectiveness of a new vaccine. The vaccine’s efficacy rate is known to have a population standard deviation of 10%. In a clinical trial, a random sample of 60 patients is given the vaccine, and the mean efficacy rate observed is 85%.

Calculate a 95% confidence interval for the true efficacy rate of the vaccine.

Solution: Given:

$\bar{x} = 85\%$
$\sigma = 10\%$
$n = 60$
For a 95% confidence interval, $z_{\alpha/2}$ is approximately 1.96 (from Z-tables)

Calculating the margin of error:

\[\begin{equation*} E = 1.96 \left(\dfrac{10\%}{\sqrt{60}}\right) \approx 2.53\% \end{equation*}\]

Therefore, the 95% confidence interval is:

\[\begin{equation*} CI = 85\% \pm 1.96 \left(\dfrac{10\%}{\sqrt{60}}\right) = 85\% \pm 2.53\% \end{equation*}\]

Which gives us:

\[\begin{equation*} CI = (82.47\%, 87.53\%) \end{equation*}\]

So, we can say with 95% confidence that the true efficacy rate of the vaccine is between 82.47% and 87.53%.

Fig. 6.13 provides a visual representation of the 95% confidence interval for the vaccine efficacy rate:

The center line represents the sample mean ($\bar{x} = 85\%$), our best point estimate of the population mean efficacy rate.
The shaded area represents the 95% confidence interval, ranging from 82.47% to 87.53%.
Half the width of the shaded area is the margin of error (2.53%), indicating the precision of our estimate.

Fig. 6.13 Visualization of the 95% Confidence Interval for the Vaccine Efficacy Rate

This visualization helps us interpret the results in the context of vaccine efficacy:

We can be 95% confident that the true population mean efficacy rate of the vaccine falls somewhere within the shaded area.
While the sample mean of 85% is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The relatively narrow width of the shaded area suggests a fairly precise estimate, which can be attributed to the large sample size (60) and the known population standard deviation.

This interval is based on our sample of 60 patients and the known population standard deviation of 10%, using the normal distribution as justified by the Central Limit Theorem. The narrow interval provides strong evidence for the vaccine’s effectiveness, as even the lower bound of the interval (82.47%) represents a high efficacy rate.

Example 6.14

A university is analyzing the performance of its students in a standardized test. The test scores are known to have a population standard deviation of 12 points. The university takes a random sample of 25 students and finds that the mean test score is 68 points.

Calculate a 90% confidence interval for the true mean test score of the university’s student population.

Solution: Given:

$\bar{x} = 68$ points
$\sigma = 12$ points
$n = 25$
For a 90% confidence interval, $z_{\alpha/2}$ is approximately 1.645 (from Z-tables)

Calculating the margin of error:

\[\begin{equation*} E = 1.645 \left(\dfrac{12}{\sqrt{25}}\right) = 1.645 \cdot 2.4 = 3.948 \end{equation*}\]

Therefore, the 90% confidence interval is:

\[\begin{equation*} CI = 68 \pm 1.645 \left(\dfrac{12}{\sqrt{25}}\right) = 68 \pm 3.948 \end{equation*}\]

Which gives us:

\[\begin{equation*} CI = (64.052, 71.948) \end{equation*}\]

So, we can say with 90% confidence that the true mean test score of the university’s student population is between 64.052 and 71.948 points.

Fig. 6.14 provides a visual representation of the 90% confidence interval for the mean test score:

The center line represents the sample mean ($\bar{x} = 68$ points), our best point estimate of the population mean test score.
The shaded area represents the 90% confidence interval, ranging from 64.052 to 71.948 points.
Half the width of the shaded area is the margin of error (3.948 points), indicating the precision of our estimate.

Fig. 6.14 Visualization of the 90% Confidence Interval for the Mean Test Score

This visualization helps us interpret the results in the context of student test scores:

We can be 90% confident that the true population mean test score falls somewhere within the shaded area.
While the sample mean of 68 points is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The width of the shaded area reflects the level of confidence (90%) and the variability in test scores, as well as the relatively small sample size.

This interval is based on our sample of 25 students and the known population standard deviation of 12 points. Because the sample is fairly small ($n = 25$), using the normal distribution assumes that test scores are approximately normally distributed; the Central Limit Theorem alone is a weaker justification at this sample size. The interval provides a range of plausible values for the true mean test score of the university’s student population, which can be useful for educational assessment and planning purposes.

Example 6.15

An agricultural scientist is studying the yield of a particular strain of wheat. From previous studies, the population standard deviation of the wheat yield per acre is known to be 3 bushels. The scientist collects a random sample of 35 fields and finds that the mean yield is 50 bushels per acre.

Calculate a 98% confidence interval for the true mean yield of this strain of wheat per acre.

Solution: Given:

$\bar{x} = 50$ bushels per acre
$\sigma = 3$ bushels per acre
$n = 35$
For a 98% confidence interval, $z_{\alpha/2}$ is approximately 2.33 (from Z-tables)

Calculating the margin of error:

\[\begin{equation*} E = 2.33 \left(\dfrac{3}{\sqrt{35}}\right) \approx 1.18 \end{equation*}\]

Therefore, the 98% confidence interval is:

\[\begin{equation*}CI = 50 \pm 2.33 \left(\dfrac{3}{\sqrt{35}}\right) = 50 \pm 1.18\end{equation*}\]

Which gives us:

\[\begin{equation*}CI = (48.82, 51.18)\end{equation*}\]

So, we can say with 98% confidence that the true mean yield of this strain of wheat per acre is between 48.82 and 51.18 bushels.

Fig. 6.15 provides a visual representation of the 98% confidence interval for the mean wheat yield:

The center line represents the sample mean ($\bar{x} = 50$ bushels per acre), our best point estimate of the population mean yield.
The shaded area represents the 98% confidence interval, ranging from 48.82 to 51.18 bushels per acre.
Half the width of the shaded area is the margin of error (1.18 bushels), indicating the precision of our estimate.

Fig. 6.15 Visualization of the 98% Confidence Interval for the Mean Wheat Yield

This visualization helps us interpret the results in the context of wheat yield:

We can be 98% confident that the true population mean yield of this strain of wheat falls somewhere within the shaded area.
While the sample mean of 50 bushels per acre is our best single-value estimate, the shaded area acknowledges the uncertainty in this estimate.
The relatively narrow width of the shaded area suggests a precise estimate, which can be attributed to the sample size (35) and the small population standard deviation (3 bushels). The high confidence level (98%) works in the opposite direction: it makes the interval wider than it would be at 95%.

This interval is based on our sample of 35 fields and the known population standard deviation of 3 bushels per acre, using the normal distribution as justified by the Central Limit Theorem. The narrow interval provides strong evidence for the consistency of this wheat strain’s yield, which can be valuable information for farmers and agricultural planners.

6.2.3. Determining the Required Sample Size

The margin of error (E) indicates how precisely an unknown population parameter is estimated: it is the half-width of the confidence interval at the chosen confidence level. Because $E$ shrinks as the sample size $n$ grows, fixing a target margin of error in advance determines how large a sample is needed.

6.2.4. Sample Size for Estimating the Population Mean ($\mu$)

When constructing a confidence interval for the population mean, the sample size needed can be calculated using the following formula:

\[\begin{equation} n = \left( \dfrac{z_{\alpha/2} \cdot \sigma}{E} \right)^{2} \tag{6.3} \end{equation}\]

Where:

$n$ is the sample size to be determined.
$z_{\alpha/2}$ is the z-score corresponding to the desired confidence level (1 − $\alpha$).
$\sigma$ is the population standard deviation.
$E$ is the margin of error.

After calculating the value of $n$, it should be rounded up to the nearest whole number to ensure the sample size is sufficient to achieve the specified margin of error.
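A minimal sketch of Equation (6.3), including the round-up step, is shown below. The function name `required_sample_size`, the input checks, and the example inputs are hypothetical. The critical value is computed with `statistics.NormalDist`, which differs slightly from rounded table values but does not change the rounded-up result in typical cases.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(sigma: float, margin: float, confidence: float) -> int:
    """Smallest whole n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    if sigma <= 0 or margin <= 0 or not 0.0 < confidence < 1.0:
        raise ValueError("need sigma > 0, margin > 0 and 0 < confidence < 1")
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    n_exact = (z * sigma / margin) ** 2
    return ceil(n_exact)   # always round up, never to the nearest integer


# Hypothetical inputs: sigma = 15, target margin of error E = 2, 95% confidence.
print(required_sample_size(sigma=15.0, margin=2.0, confidence=0.95))
```

`ceil` is used instead of `round` because rounding down would leave the margin of error slightly above the target.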

Example 6.16

A research team is conducting a study on the concentration of a particular pesticide in agricultural runoff water in Alberta. Historical data suggests that the population standard deviation for the pesticide concentration is approximately $\sigma = 1.8$ ppm (parts per million).

The team wants to estimate the average concentration of the pesticide with a 95% confidence interval and a total width of 2 ppm (which means the margin of error, $E$, is $1$ ppm).

How many samples should the research team collect to achieve their desired confidence interval?

Solution: To solve this, we’ll use the estimation formula for determining the sample size for estimating the population mean. Given:

$\sigma = 1.8$ ppm (population standard deviation)
$E = 1$ ppm (margin of error)
For a 95% confidence level, the z-score ($z_{\alpha/2}$) is approximately 1.96 (This value comes from standard z-tables corresponding to a 95% confidence level).

Plugging these values into the formula:

\[\begin{equation*} n = \left( \dfrac{1.96 \cdot 1.8}{1} \right)^{2} = \left( 3.528 \right)^{2} \approx 12.447 \end{equation*}\]

Since we can’t collect a fraction of a sample, we round up to the nearest whole number:

Sample Size (n) = 13

Therefore, the research team should collect 13 samples to estimate the average pesticide concentration with a 95% confidence interval and a margin of error of 1 ppm.

Note: Verifying the Sample Size Calculation

To validate that a sample size of 13 achieves the desired margin of error, we can calculate the actual margin of error and compare it to the target. Let’s examine this process step-by-step:

Given parameters:
- Sample size ($n$) = 13
- Population standard deviation ($\sigma$) = 1.8 ppm
- Confidence level = 95% ($z_{\alpha/2}$ = 1.96)
- Target margin of error ($E$) = 1 ppm

Margin of error formula: $E = z_{\alpha/2} \cdot (\sigma / \sqrt{n})$

Calculation:

\[\begin{align*} E &= 1.96 \cdot (1.8 / \sqrt{13}) \\ &= 1.96 \cdot (1.8 / 3.6056) \\ &= 1.96 \cdot 0.4992 \\ &\approx 0.9785 \text{ ppm} \end{align*}\]

Analysis: The calculated margin of error (0.9785 ppm) is slightly smaller than the target (1 ppm), confirming that a sample size of 13 meets the precision requirement.

This result validates that rounding up from the theoretical 12.447 to 13 samples in the original calculation was appropriate and yields the desired level of precision.

For comparison:

With $n = 12$, the margin of error would be:

\[\begin{align*} E &= 1.96 \cdot (1.8 / \sqrt{12}) \\ &\approx 1.0184 \text{ ppm} \end{align*}\]

This exceeds the target margin of error, demonstrating why 12 samples are insufficient.

Conclusion: A sample size of 13 is optimal as it:

Meets the required precision (margin of error ≤ 1 ppm)
Represents the smallest whole number of samples that achieves this precision
Provides a slight buffer in precision, enhancing the reliability of the study’s results

This verification underscores the importance of careful sample size calculation in ensuring statistical validity while optimizing resource use in research studies.
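The same verification can be done by brute force: increase $n$ one step at a time until the margin of error first drops to the target. The sketch below uses the values from the pesticide example and the rounded $z_{\alpha/2} = 1.96$. The helper `achieved_margin` is a hypothetical name introduced for illustration.

```python
from math import sqrt

sigma, z, target = 1.8, 1.96, 1.0    # values from the pesticide example


def achieved_margin(n: int) -> float:
    """Margin of error actually obtained with n observations."""
    return z * sigma / sqrt(n)


# Walk upward from n = 1 until the margin of error first meets the target.
n = 1
while achieved_margin(n) > target:
    n += 1

print(f"smallest n = {n}")
print(f"E(n-1) = {achieved_margin(n - 1):.4f} ppm, E(n) = {achieved_margin(n):.4f} ppm")
```

The search stops at the same sample size that the formula gives after rounding up, which is a useful cross-check on the closed-form calculation.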

Example 6.17

A health organization wants to estimate the average time spent on physical activity per week by adults in Calgary. They have previous studies indicating that the standard deviation for weekly physical activity time is $\sigma = 1.2$ hours.

The organization aims to construct a 90% confidence interval for the average weekly physical activity time with a margin of error of $E = 0.5$ hours.

How many adults should be surveyed to meet the organization’s requirements for the confidence interval?

Solution: To solve this, we’ll use the estimation formula for determining the sample size for estimating the population mean. Given:

$\sigma = 1.2$ hours (population standard deviation)
$E = 0.5$ hours (margin of error)
For a 90% confidence level, the z-score ($z_{\alpha/2}$) is approximately 1.645 (This value is derived from standard z-tables corresponding to a 90% confidence level).

Plugging in the values:

\[\begin{equation*} n = \left( \dfrac{1.645 \cdot 1.2}{0.5} \right)^{2} = \left( 3.948 \right)^{2} \approx 15.587 \end{equation*}\]

Rounding up to the nearest whole number:

Sample Size (n) = 16

Therefore, the health organization should survey 16 adults to estimate the average weekly physical activity time with a 90% confidence interval and a margin of error of 0.5 hours.

Example 6.18

A nutritionist is interested in the average sugar content of a popular brand of cereal sold in Calgary. To ensure accuracy, the nutritionist wants to calculate a 99% confidence interval for the average sugar content. Previous studies indicate that the standard deviation of sugar content in this cereal brand is $\sigma = 0.8$ grams per serving.

The nutritionist desires a margin of error of $E = 0.2$ grams per serving for the confidence interval.

How many cereal boxes should the nutritionist test to estimate the average sugar content with the specified confidence and precision?

Solution: To solve this, we’ll use the estimation formula for determining the sample size for estimating the population mean. Given:

$\sigma = 0.8$ grams per serving (population standard deviation)
$E = 0.2$ grams per serving (margin of error)
For a 99% confidence level, the z-score ($z_{\alpha/2}$) is approximately 2.576 (This value is obtained from standard z-tables corresponding to a 99% confidence level).

Plugging in the values:

\[\begin{equation*} n = \left( \dfrac{2.576 \cdot 0.8}{0.2} \right)^{2} = \left( 10.304 \right)^{2} \approx 106.172 \end{equation*}\]

Rounding up to the nearest whole number:

Sample Size (n) = 107

Therefore, the nutritionist should test 107 cereal boxes to estimate the average sugar content with a 99% confidence interval and a margin of error of 0.2 grams per serving.
