Let's start our investigation of conditional distributions with an example that highlights the distinction between a joint (bivariate) probability distribution and a conditional probability distribution.

### Example 19-1

A Safety Officer for an auto insurance company in Connecticut was interested in learning how the extent of an individual's injury in an automobile accident relates to the type of safety restraint the individual was wearing at the time of the accident. As a result, the Safety Officer used statewide ambulance and police records to compile the following two-way table of joint probabilities:

For the sake of understanding the Safety Officer's terminology, let's assume that "Belt only" means that the person was only using the lap belt, whereas "Belt and Harness" should be taken to mean that the person was using a lap belt and shoulder strap. (These data must have been collected a loooonnnggg time ago when such an option was legal!) Also, note that the Safety Officer created the random variable \(X\), the extent of injury, by arbitrarily assigning values 0, 1, 2, and 3 to each of the possible outcomes None, Minor, Major, and Death. Similarly, the Safety Officer created the random variable \(Y\), the type of restraint, by arbitrarily assigning values 0, 1, and 2 to each of the possible outcomes None, Belt Only, and Belt and Harness.

Among other things, the Safety Officer was interested in answering the following questions:

- What is the probability that a randomly selected person in an automobile accident was wearing a seat belt and had only a minor injury?
- If a randomly selected person wears no restraint, what is the probability of death?
- If a randomly selected person sustains no injury, what is the probability the person was wearing a belt and harness?

Before we can help the Safety Officer answer his questions, it will help to have a couple of (informal) definitions under our belt.

*Definition*

A **joint (bivariate) probability distribution** describes the probability that a randomly selected person from the *population* has the *two characteristics* of interest.

There is nothing really new here. By now, we should know the definition of a bivariate probability distribution not only informally, but also formally.

### Example (continued)

What is the probability a randomly selected person in an accident was wearing a seat belt and had only a minor injury?

#### Solution

Let \(A\) = the event that a randomly selected person in a car accident has a minor injury. Let \(B\) = the event that the randomly selected person was wearing only a seat belt. Then, just reading the value right off of the Safety Officer's table, we get:

\(P(A\text{ and }B)=P(X=1, Y=1)=f(1, 1)=0.16\)

That is, there is a 16% chance that a randomly selected person in an accident is wearing a seat belt *and* has only a minor injury.

Now, of course, in order to define the joint probability distribution of \(X\) and \(Y\) fully, we'd need to find the probability that \(X=x\) and \(Y=y\) for each element in the joint support \(S\), not just for one element \(X=1\) and \(Y=1\). But, that's not our point here. Here, we are revisiting the meaning of the joint probability distribution of \(X\) and \(Y\) just so we can distinguish between it and a conditional probability distribution.

*Definition*

A **conditional probability distribution** is a probability distribution for a sub-population. That is, a conditional probability distribution describes the probability that a randomly selected person from a *sub-population* has the *one characteristic of interest*.

### Example (continued)

If a randomly selected person wears no restraint, what is the probability of death?

#### Solution

As you can see, the Safety Officer wants to know a conditional probability. So, we need to use the definition of conditional probability to calculate the desired probability. But, let's first dissect the Safety Officer's question into two parts by identifying the subpopulation and the characteristic of interest. Here, the **subpopulation** is the population of people wearing no restraints (\(NR\)), and the **characteristic of interest** is death (\(D\)). Then, using the definition of conditional probability, we determine that the desired probability is:

\(P(D|NR)=\dfrac{P(D \cap NR)}{P(NR)}=\dfrac{P(X=3,Y=0)}{P(Y=0)}=\dfrac{f(3,0)}{f_Y(0)}=\dfrac{0.025}{0.40}=0.0625\)

That is, there is a 6.25% chance of death of a randomly selected person in an automobile accident, *if* the person wears no restraint.
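The arithmetic above can be checked with a minimal Python sketch. It uses only the two values quoted in the solution, \(f(3,0)=0.025\) and \(f_Y(0)=0.40\); the variable names are illustrative assumptions, and exact fractions are used simply to avoid floating-point rounding.

```python
from fractions import Fraction

# Values quoted in the solution above (read from the Safety Officer's table).
f_3_0 = Fraction("0.025")  # joint probability P(X=3, Y=0): death and no restraint
f_Y_0 = Fraction("0.40")   # marginal probability P(Y=0): no restraint

# Definition of conditional probability: P(D | NR) = P(D and NR) / P(NR)
p_death_given_no_restraint = f_3_0 / f_Y_0

print(p_death_given_no_restraint)         # 1/16
print(float(p_death_given_no_restraint))  # 0.0625
```

Dividing by the marginal probability \(f_Y(0)\) is what restricts attention to the sub-population of people wearing no restraint.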

In order to define the conditional probability distribution of \(X\) given \(Y\) fully, we'd need to find the probability that \(X=x\) given \(Y=y\) for each element in the joint support \(S\), not just for one element \(X=3\) and \(Y=0\). But, again, that's not our point here. Here, we are simply trying to get the feel of how a conditional probability distribution describes the probability that a randomly selected person from a *sub-population* has the *one characteristic of interest*.
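Stated in symbols, and assuming the notation \(g(x|y)\) for the conditional probability mass function of \(X\) given \(Y=y\) (a label introduced here for illustration, not one taken from the example above), the same calculation applied to every \(x\) would read:

\(g(x|y)=P(X=x|Y=y)=\dfrac{P(X=x,Y=y)}{P(Y=y)}=\dfrac{f(x,y)}{f_Y(y)}, \quad \text{provided } f_Y(y)>0\)

For a fixed \(y\), these values sum to 1 over \(x\), because \(\sum_x f(x,y)=f_Y(y)\). That is what makes the result a probability distribution for the sub-population.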

### Example (continued)

If a randomly selected person sustains no injury, what is the probability the person was wearing a seatbelt and harness?

#### Solution

Again, the Safety Officer wants to know a conditional probability. Let's again first dissect the Safety Officer's question into two parts by identifying the subpopulation and the characteristic of interest. Here, the **subpopulation** is the population of people sustaining no injury (\(NI\)), and the **characteristic of interest** is wearing a seatbelt and harness (\(SH\)). Then, again using the definition of conditional probability, we determine that the desired probability is:

\(P(SH|NI)=\dfrac{P(SH \cap NI)}{P(NI)}=\dfrac{P(X=0,Y=2)}{P(X=0)}=\dfrac{f(0,2)}{f_X(0)}=\dfrac{0.06}{0.20}=0.30\)

That is, there is a 30% chance that a randomly selected person in an automobile accident is wearing a seatbelt and harness, *if* the person sustains no injury.

To define the conditional probability distribution of \(Y\) given \(X\) fully, we'd need to find the probability that \(Y=y\) given \(X=x\) for each element in the joint support \(S\), not just for the one element \(X=0\) and \(Y=2\). But, again, that's not our point here. Here, we are simply trying to get the feel of how a conditional probability distribution describes the probability that a randomly selected person from a *sub-population* has the *one characteristic of interest*.
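To give a feel for what finding the conditional probability for each element would involve, here is a minimal Python sketch. The function name `conditional_pmf` and the small joint table `toy_joint` are hypothetical: the numbers are made up for illustration and are not the Safety Officer's data.

```python
from fractions import Fraction

def conditional_pmf(joint, given_index, given_value):
    """Return the conditional pmf of the other variable.

    joint maps (x, y) pairs to probabilities. Use given_index=0 to
    condition on X and given_index=1 to condition on Y.
    """
    other_index = 1 - given_index
    # Keep only the cells that belong to the sub-population.
    cells = {k: p for k, p in joint.items() if k[given_index] == given_value}
    marginal = sum(cells.values())  # probability of the sub-population
    if marginal == 0:
        raise ValueError("cannot condition on an event with probability zero")
    # Divide each joint probability by the marginal probability.
    return {k[other_index]: p / marginal for k, p in cells.items()}

# Hypothetical toy joint pmf, NOT the Safety Officer's table.
toy_joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(3, 10),
    (1, 0): Fraction(2, 10), (1, 1): Fraction(4, 10),
}

x_given_y0 = conditional_pmf(toy_joint, given_index=1, given_value=0)
y_given_x0 = conditional_pmf(toy_joint, given_index=0, given_value=0)
print(x_given_y0)
print(y_given_x0)
```

Each returned dictionary is a complete conditional distribution for one sub-population, so its values sum to 1.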