2.1.3.2.5 - Conditional Probability

A conditional probability is the probability of one event occurring given that a second event is known to have occurred. This is communicated using the symbol \(\mid\) which is read as "given." For example, \(P(A\mid B)\) is read as "Probability of A given B."

A conditional probability can be computed using a two-way contingency table. In the examples below, note that we're only interested in the events in one row or column.

Example: PA Resident given Undergraduate Section

The two-way contingency table below displays Penn State World Campus enrollments from Fall 2019 in terms of academic level (undergraduate and graduate) and state residency (Pennsylvania and non-Pennsylvania).

Two-Way Table of Academic Level and State Residency
	Pennsylvania	Non-Pennsylvania	Total
Undergraduate	3757	4603	8360
Graduate	2253	4074	6327
Total	6010	8677	14687

Given an individual is an undergraduate student, what is the probability they are a Pennsylvania resident?

We know the individual is an undergraduate student, so we will only look at the row containing the 8360 undergraduate students. Of those 8360 undergraduate students, 3757 were Pennsylvania residents.

\(P(PA \mid Undergrad) = \dfrac{3757}{8360}=0.449\)

Given an individual is a Pennsylvania resident, what is the probability they are an undergraduate student?

Note that most cases, \(P(A\mid B) \ne P(B \mid A)\). This question is different from the first question because the two events are flipped. Here, we know the individual is a Pennsylvania resident, we we will only look at the column containing the 6010 Pennsylvania residents. Of those 6010 Pennsylvania residents, 3757 were undergraduate students.

\(P(Undergrad \mid PA) = \dfrac{3757}{6010}=0.625\)

What proportion of graduate students are Pennsylvania residents?

This question is worded slightly differently, but it is also a conditional probability. This translates to \(P(PA \mid Graduate)\). Of the 6327 graduate students, 2253 were Pennsylvania residents.

\(P(PA \mid Graduate) = \dfrac{2253}{6327}=0.356\)

Sensitivity & Specificity Section

Sensitivity and specificity are two specific types of conditional probabilities that are often applied in situations involving testing (e.g., medical testing for a given condition). Sensitivity is the probability of testing positive given that one actually has the condition. Specificity is the probability of testing negative given that one actually does not have the condition. Ideally, we would like both sensitivity and specificity rates to be high.

Example: Sensitivity & Specificity Section

Compute the sensitivity and specificity of the test data presented in the following two-way contingency table.

Two-Way Table of Test Results and Actual Health
	Actually Sick	Actually Healthy	Total
Tested Positive	15	5	20
Tested Negative	2	19	21
Total	17	24	41

Sensitivity is the proportion of all people who were actually sick who tested positive. As a conditional probability, \(P(positive \mid sick)\). There were 17 people in the sample who were actually sick. Of those, 15 tested positive.

\(Sensitivity = \dfrac{15}{17}=0.882\)

Specificity is the proportion of all people who were actually healthy who tested negative. As a conditional probability, \(P(negative \mid healthy)\). There were 24 people in the sample who were actually healthy. Of those, 19 tested negative.

\(Specificity = \dfrac {19}{24}=0.792\)