3.5. Contingency Table

A contingency table, also known as a cross-tabulation or crosstab, is a table in matrix format that displays the joint (multivariate) frequency distribution of two or more categorical variables. Organizing the data into rows and columns makes it easier to see whether, and how, the variables are associated.

Each cell in a contingency table represents the count or frequency of a specific combination of variable outcomes. The margins of the table (the row and column totals) give the total count for each category of each variable, which is useful for calculating probabilities and conditional probabilities.
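To make the counting concrete, here is a minimal Python sketch that tallies a contingency table from raw observations. The six records, the two variables (a group label and a yes/no answer), and the names `cells`, `row_totals`, and `col_totals` are all made up for illustration; they are not data from this section.

```python
from collections import Counter

# Made-up raw observations: one (group, answer) pair per respondent.
records = [
    ("A", "yes"),
    ("A", "no"),
    ("B", "yes"),
    ("B", "yes"),
    ("A", "yes"),
    ("B", "no"),
]

# Each cell counts how often one combination of outcomes occurs.
cells = Counter(records)

# Margins: total count for each category of each variable.
row_totals = Counter(group for group, _ in records)
col_totals = Counter(answer for _, answer in records)

for group in sorted(row_totals):
    row = [cells[(group, answer)] for answer in sorted(col_totals)]
    print(group, row, "total:", row_totals[group])
```

The cell counts always add up to the margins, and the margins of either variable add up to the number of records.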

Example 3.24 (Pizza Preference)

Imagine a survey conducted in Calgary to determine the pizza preference among university students. The survey categorizes students into those who love pizza and those who do not. Additionally, it considers whether the students live on-campus or off-campus. The hypothetical data from the survey is as follows:

|                  | Loves Pizza | Does Not Love Pizza |
|------------------|-------------|---------------------|
| Lives On-Campus  | 120         | 80                  |
| Lives Off-Campus | 180         | 120                 |

In this sample of 500 students, 300 are categorized as loving pizza, while the remaining 200 do not love pizza. When considering their living situation, 200 students live on-campus, and 300 live off-campus.
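The totals quoted above are the margins of the table. As a rough sketch, they can be recomputed from the four cell counts in plain Python; the dictionary layout and the helper name `margin` are assumptions made for this illustration.

```python
# Cell counts from Example 3.24, keyed by (residence, preference).
counts = {
    ("On-Campus", "Loves Pizza"): 120,
    ("On-Campus", "Does Not Love Pizza"): 80,
    ("Off-Campus", "Loves Pizza"): 180,
    ("Off-Campus", "Does Not Love Pizza"): 120,
}


def margin(table, axis):
    """Sum cell counts over the other variable (axis 0 = rows, 1 = columns)."""
    totals = {}
    for key, n in table.items():
        totals[key[axis]] = totals.get(key[axis], 0) + n
    return totals


row_totals = margin(counts, 0)   # totals by residence
col_totals = margin(counts, 1)   # totals by pizza preference
grand_total = sum(counts.values())

print(row_totals)
print(col_totals)
print(grand_total)
```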

Example 3.25 (Pizza Preference Probability)

Use the data from Example 3.24 to calculate the indicated probabilities.

  • a. Find \( P(\text{Student loves pizza}) \).

  • b. Find \( P(\text{Student lives off-campus}) \).

  • c. Find \( P(\text{Student lives off-campus AND loves pizza}) \).

  • d. Find \( P(\text{Student loves pizza OR lives off-campus}) \).

Solution:

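|                  | Loves Pizza | Does Not Love Pizza | Total |
|------------------|-------------|---------------------|-------|
| Lives On-Campus  | 120         | 80                  | 200   |
| Lives Off-Campus | 180         | 120                 | 300   |
| Total            | 300         | 200                 | 500   |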

a) Probability of a Student Loving Pizza: The total number of students who love pizza is 300 out of 500 students.

\[\begin{equation*} P(\text{Student loves pizza}) = \frac{300}{500} = \frac{3}{5} = 0.6000 \end{equation*}\]

b) Probability of a Student Living Off-Campus: The total number of students living off-campus is 300 out of 500 students.

\[\begin{equation*} P(\text{Student lives off-campus}) = \frac{300}{500} = \frac{3}{5} = 0.6000 \end{equation*}\]

c) Probability of a Student Living Off-Campus AND Loving Pizza: The number of students who live off-campus and love pizza is 180 out of 500 students.

\[\begin{equation*} P(\text{Student lives off-campus AND loves pizza}) = \frac{180}{500} = \frac{9}{25} = 0.3600 \end{equation*}\]

d) Probability of a Student Loving Pizza OR Living Off-Campus: To find this probability, we add the probabilities of each individual event and subtract the probability of their intersection to avoid double-counting.

\[\begin{align*} P(\text{Student loves pizza OR lives off-campus}) &= P(\text{Student loves pizza}) \\ & + P(\text{Student lives off-campus}) \\ & - P(\text{Student lives off-campus AND loves pizza}) \\ & = \frac{3}{5} + \frac{3}{5} - \frac{9}{25} = \frac{21}{25} = 0.8400 \end{align*}\]
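The four answers can be checked with a short Python sketch that uses exact fractions. It also counts the OR event directly from the cells, which should agree with the addition rule used in part d. The helper `prob` and the predicate functions are names chosen for this illustration only.

```python
from fractions import Fraction

# Cell counts from Example 3.24, keyed by (residence, preference).
counts = {
    ("On-Campus", "Loves Pizza"): 120,
    ("On-Campus", "Does Not Love Pizza"): 80,
    ("Off-Campus", "Loves Pizza"): 180,
    ("Off-Campus", "Does Not Love Pizza"): 120,
}
total = sum(counts.values())


def prob(event):
    """Share of students whose (residence, preference) cell satisfies event."""
    favorable = sum(n for cell, n in counts.items() if event(cell))
    return Fraction(favorable, total)


def loves_pizza(cell):
    return cell[1] == "Loves Pizza"


def off_campus(cell):
    return cell[0] == "Off-Campus"


p_loves = prob(loves_pizza)                                   # part a
p_off = prob(off_campus)                                      # part b
p_both = prob(lambda c: loves_pizza(c) and off_campus(c))     # part c
p_either_rule = p_loves + p_off - p_both                      # part d, addition rule
p_either_direct = prob(lambda c: loves_pizza(c) or off_campus(c))

print(p_loves, p_off, p_both, p_either_rule, p_either_direct)
```

Counting the OR event directly adds the three cells 120 + 180 + 120 = 420, which gives the same fraction as the addition rule.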

Example 3.26 (Study Location Preferences Among Students)

A survey was conducted to find out the preferred study locations among students at a university. The students could choose between studying at the library or at a coffee shop. The survey also recorded whether each student was an undergraduate or a graduate student. The data collected is summarized in the contingency table below:

|                        | Prefers Library | Prefers Coffee Shop |
|------------------------|-----------------|---------------------|
| Undergraduate Students | 160             | 40                  |
| Graduate Students      | 120             | 180                 |

Using the data in the table, calculate the following probabilities:

  • a. Find \( P(\text{Student prefers the library}) \).

  • b. Find \( P(\text{Student is an undergraduate}) \).

  • c. Find \( P(\text{Student prefers the library AND is a graduate student}) \).

  • d. Find \( P(\text{Student prefers the library OR is an undergraduate student}) \).

Solution:

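|                        | Prefers Library | Prefers Coffee Shop | Total |
|------------------------|-----------------|---------------------|-------|
| Undergraduate Students | 160             | 40                  | 200   |
| Graduate Students      | 120             | 180                 | 300   |
| Total                  | 280             | 220                 | 500   |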

a) Probability of a Student Preferring the Library: The total number of students who prefer the library is 280 out of 500 students.

\[\begin{equation*} P(\text{Student prefers the library}) = \frac{280}{500} = \frac{14}{25} = 0.5600 \end{equation*}\]

b) Probability of a Student Being an Undergraduate: The total number of undergraduate students is 200 out of 500 students.

\[\begin{equation*} P(\text{Student is an undergraduate}) = \frac{200}{500} = \frac{2}{5} = 0.4000 \end{equation*}\]

c) Probability of a Student Preferring the Library AND Being a Graduate Student: The number of graduate students who prefer the library is 120 out of 500 students.

\[\begin{equation*} P(\text{Student prefers the library AND is a graduate student}) = \frac{120}{500} = \frac{6}{25} = 0.2400 \end{equation*}\]

d) Probability of a Student Preferring the Library OR Being an Undergraduate Student: To find this probability, we add the probabilities of each individual event and subtract the probability of their intersection to avoid double-counting. The intersection consists of the 160 undergraduates who prefer the library, so its probability is 160/500 = 8/25.

\[\begin{align*} P(\text{Student prefers the library OR is an undergraduate student}) &= P(\text{Student prefers the library}) \\ & + P(\text{Student is an undergraduate}) \\ & - P(\text{Student prefers the library AND is an undergraduate}) \\ & = \frac{14}{25} + \frac{2}{5} - \frac{8}{25} = \frac{16}{25} = 0.6400 \end{align*}\]
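One way to sanity-check part d is through the complement: the only students who neither prefer the library nor are undergraduates are the graduate students who prefer the coffee shop. The Python sketch below compares the two routes; the variable names are assumptions made for this illustration.

```python
from fractions import Fraction

# Cell counts from Example 3.26, keyed by (level, preferred location).
counts = {
    ("Undergraduate", "Library"): 160,
    ("Undergraduate", "Coffee Shop"): 40,
    ("Graduate", "Library"): 120,
    ("Graduate", "Coffee Shop"): 180,
}
total = sum(counts.values())

# Route 1: addition rule, P(A) + P(B) - P(A and B).
p_library = Fraction(
    counts[("Undergraduate", "Library")] + counts[("Graduate", "Library")], total
)
p_undergrad = Fraction(
    counts[("Undergraduate", "Library")] + counts[("Undergraduate", "Coffee Shop")],
    total,
)
p_both = Fraction(counts[("Undergraduate", "Library")], total)
p_union_rule = p_library + p_undergrad - p_both

# Route 2: complement rule, 1 - P(neither A nor B).
p_neither = Fraction(counts[("Graduate", "Coffee Shop")], total)
p_union_complement = 1 - p_neither

print(p_union_rule, p_union_complement, float(p_union_rule))
```

Both routes give 16/25, which matches the 0.6400 obtained above.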

Example 3.27 (Students’ Favorite Entertainment Preferences)

A survey was conducted at a university to determine students’ favorite types of entertainment. The options were Movies, Music, and Video Games. The students were also categorized by their level of study (Undergraduate or Graduate) and sex (Male or Female). The collected data is summarized in the contingency table below:

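|                        | Movies | Music | Video Games |
|------------------------|--------|-------|-------------|
| Undergraduate - Male   | 100    | 120   | 80          |
| Undergraduate - Female | 90     | 110   | 70          |
| Graduate - Male        | 60     | 80    | 60          |
| Graduate - Female      | 50     | 70    | 40          |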

Using the data in the table, calculate the following probabilities:

  • a. Find \( P(\text{Student prefers Movies}) \).

  • b. Find \( P(\text{Undergraduate Female student}) \).

  • c. Find \( P(\text{Student is a graduate student AND prefers Video Games}) \).

  • d. Find \( P(\text{Student prefers Music OR is a Male student}) \).

Solution:

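|                        | Movies | Music | Video Games | Total |
|------------------------|--------|-------|-------------|-------|
| Undergraduate - Male   | 100    | 120   | 80          | 300   |
| Undergraduate - Female | 90     | 110   | 70          | 270   |
| Graduate - Male        | 60     | 80    | 60          | 200   |
| Graduate - Female      | 50     | 70    | 40          | 160   |
| Total                  | 300    | 380   | 250         | 930   |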

a) Probability of a Student Preferring Movies: The total number of students who prefer movies is 300 out of 930 students.

\[\begin{equation*} P(\text{Student prefers Movies}) = \frac{300}{930} = \frac{10}{31} \approx 0.3226 \end{equation*}\]

b) Probability of an Undergraduate Female Student: The total number of undergraduate female students is 270 out of 930 students.

\[\begin{equation*} P(\text{Undergraduate Female student}) = \frac{270}{930} = \frac{9}{31} \approx 0.2903 \end{equation*}\]

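c) Probability of a Student Being a Graduate Student AND Preferring Video Games: The event is read as a joint event (the student is a graduate student and prefers video games), not as a probability conditional on being a graduate student. The number of graduate students who prefer video games is 100 (60 male + 40 female) out of 930 students.

\[\begin{equation*} P(\text{Graduate student AND prefers Video Games}) = \frac{100}{930} = \frac{10}{93} \approx 0.1075 \end{equation*}\]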

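d) Probability of a Student Preferring Music OR Being a Male Student: To find this probability, we add the probabilities of each individual event and subtract the probability of their intersection to avoid double-counting. From the table, 380 students prefer music, 300 + 200 = 500 students are male, and 120 + 80 = 200 male students prefer music.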

\[\begin{align*} P(\text{Student prefers Music OR is a Male student}) &= P(\text{Music}) + P(\text{Male}) - P(\text{Male and Music}) \\ &= \frac{380}{930} + \frac{500}{930} - \frac{200}{930} \\ &= \frac{38}{93} + \frac{50}{93} - \frac{20}{93} \\ &= \frac{68}{93} \approx 0.7312 \end{align*}\]
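With three variables, each cell is identified by a level of study, a sex, and an entertainment choice. The Python sketch below stores the table that way and recomputes the four answers. The nested-dictionary layout and the helper `prob` are assumptions made for this illustration, not part of the example itself.

```python
from fractions import Fraction

# Counts from Example 3.27: (level, sex) -> {entertainment: count}.
counts = {
    ("Undergraduate", "Male"): {"Movies": 100, "Music": 120, "Video Games": 80},
    ("Undergraduate", "Female"): {"Movies": 90, "Music": 110, "Video Games": 70},
    ("Graduate", "Male"): {"Movies": 60, "Music": 80, "Video Games": 60},
    ("Graduate", "Female"): {"Movies": 50, "Music": 70, "Video Games": 40},
}
total = sum(sum(row.values()) for row in counts.values())


def prob(event):
    """Share of students whose (level, sex, choice) cell satisfies event."""
    favorable = sum(
        n
        for (level, sex), row in counts.items()
        for choice, n in row.items()
        if event(level, sex, choice)
    )
    return Fraction(favorable, total)


p_movies = prob(lambda level, sex, choice: choice == "Movies")
p_ug_female = prob(lambda level, sex, choice: (level, sex) == ("Undergraduate", "Female"))
p_grad_games = prob(lambda level, sex, choice: level == "Graduate" and choice == "Video Games")

# Part d, counted directly from the cells rather than with the addition rule.
p_music_or_male = prob(lambda level, sex, choice: choice == "Music" or sex == "Male")

for p in (p_movies, p_ug_female, p_grad_games, p_music_or_male):
    print(p, round(float(p), 4))
```

Counting part d directly covers all 500 male students plus the 110 + 70 = 180 female students who prefer music, for 680 out of 930, in agreement with the addition rule.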