Lesson 2: Properties of Probability

Overview

In this lesson, we learn the fundamental concepts of probability. It is this lesson that will allow us to start putting our first tools into our new probability toolbox.

Objectives

Upon completion of this lesson, you should be able to:

  • Learn why an understanding of probability is so critically important to the advancement of most kinds of scientific research.
  • Learn the definition of an event.
  • Learn how to derive new events by taking subsets, unions, intersections, and/or complements of already existing events.
  • Learn the definitions of specific kinds of events, namely empty events, mutually exclusive (or disjoint) events, and exhaustive events.
  • Learn the formal definition of probability.
  • Learn three ways of assigning a probability to an event: the personal opinion approach, the relative frequency approach, and the classical approach.
  • Learn five fundamental theorems, which, when applied, allow us to determine probabilities of various events.
  • Get lots of practice calculating probabilities of various events.

2.1 - Why Probability?

In the previous lesson, we discussed the big picture of the course without really delving into why the study of probability is so vitally important to the advancement of science. Let's do that now by looking at two examples.

Example 2-1

Suppose that the Penn State Committee for the Fun of Students claims that the average number of concerts attended yearly by Penn State students is 2. Then, suppose that we take a random sample of 50 Penn State students and determine that the average number of concerts attended by the 50 students is:

\(\dfrac{1+4+3+\ldots+2}{50}=3.2\)

that is, 3.2 concerts per year. That raises the question: if the actual population average is 2, how likely is it that we'd get a sample average as large as 3.2?

What do you think? Is it likely or not likely? If the answer to the question is ultimately "not likely", then we have two possible conclusions:

  1. Either: The true population average is indeed 2. We just happened to select a strange and unusual sample.
  2. Or: Our original claim of 2 is wrong. Reject the claim, and conclude that the true population average is more than 2.

Of course, I don't raise this example simply to draw conclusions about the frequency with which Penn State students attend concerts. Instead I raise it to illustrate that in order to use a random sample to draw a conclusion about a population, we need to be able to answer the question "how likely...?", that is "what is the probability...?". Let's take a look at another example.

Example 2-2

Suppose that the Penn State Parking Office claims that two-thirds (67%) of Penn State students maintain a car in State College. Then, suppose we take a random sample of 100 Penn State students and determine that the proportion of students in the sample who maintain a car in State College is:

\(\dfrac{69}{100}=0.69\)

that is, 69%. Now we need to ask the question: if the actual population proportion is 0.67, how likely is it that we'd get a sample proportion of 0.69?

What do you think? Is it likely or not likely? If the answer to the question is ultimately "likely," then we have just one possible conclusion: The Parking Office's claim is reasonable. Do not reject their claim.
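
To make the question "how likely...?" concrete, here is a minimal Python sketch. It assumes, as an illustration that goes beyond what this lesson has covered, that the number of car owners in a random sample of 100 students follows a binomial distribution with success probability 0.67. Under that assumption it computes the chance of seeing 69 or more car owners in the sample. The function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k_min, n, p):
    """P(X >= k_min) when X ~ Binomial(n, p) (an assumed model, not from the lesson)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Chance of 69 or more car owners among 100 students if the claim p = 0.67 holds
p_69_or_more = prob_at_least(69, 100, 0.67)
print(round(p_69_or_more, 3))
```

If the printed probability is not small, a sample proportion of 0.69 is unsurprising under the claim, which is consistent with the "likely" conclusion above.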

Again, I don't raise this example simply to draw conclusions about the driving behaviors of Penn State students. Instead I raise it to illustrate again that in order to use a random sample to draw a conclusion about a population, we need to be able to answer the question "how likely...?", that is "what is the probability...?".

Summary

So, in summary, why do we need to learn about probability? Any time we want to answer a research question that involves using a sample to draw a conclusion about some larger population, we need to answer the question "how likely is it...?" or "what is the probability...?". To answer such a question, we need to understand probability, probability rules, and probability models. And that's exactly what we'll be working on learning throughout this course.

Now that we've got the motivation for this probability course behind us, let's delve right in and start filling up our probability toolbox!


2.2 - Events

Recall that, given a random experiment, the outcome space (or sample space) \(\mathbf{S}\) is the collection of all possible outcomes of that experiment.

Event
An event, denoted with capital letters \(A, B, C, \ldots\), is just a subset of the sample space \(\mathbf{S}\). That is, for example, \(A\subset \mathbf{S}\), where "\(\subset\)" denotes "is a subset of."

Example 2-3

Suppose we randomly select a student, and ask them "how many pairs of jeans do you own?". In this case our sample space \(\mathbf{S}\) is:

\(\mathbf{S} = \{0, 1, 2, 3, ...\}\)

We could theoretically put some realistic upper limit on that sample space, but who knows what it would be? So, let's leave it unbounded. Now let's define some events.

If \(A\) is the event that a randomly selected student owns no jeans:

\(A\) = student owns none = \(\{0\}\)

If \(B\) is the event that a randomly selected student owns some jeans:

\(B\) = student owns some = \(\{1, 2, 3, ...\}\)

If \(C\) is the event that a randomly selected student owns no more than five pairs of jeans:

\(C\) = student owns no more than five pairs = \(\{0, 1, 2, 3, 4, 5\}\)

And, if \(D\) is the event that a randomly selected student owns an odd number of pairs of jeans:

\(D\) = student owns an odd number = \(\{1, 3, 5, ...\}\)

Review

Since events and sample spaces are just sets, let's review the algebra of sets:

  1. \(\emptyset\) is the "null set" (or "empty set")
  2. \(C\cup D\) = "union" = the elements in \(C\) or \(D\) or both
  3. \(A\cap B\) = "intersection" = the elements in both \(A\) and \(B\). If \(A\cap B=\emptyset\), then \(A\) and \(B\) are called "mutually exclusive events" (or "disjoint events").
  4. \(D^\prime=D^c\)= "complement" = the elements not in \(D\)
  5. If \(E\cup F\cup G...=\mathbf{S}\), then \(E, F, G\), and so on are called "exhaustive events."

Example 2-3 Continued

Let's revisit the previous "how many pairs of jeans do you own?" example. That is, suppose we randomly select a student, and ask them "how many pairs of jeans do you own?". In this case our sample space \(\mathbf{S}\) is:

\(\mathbf{S} = \{0, 1, 2, 3, ...\}\)

Now, let's define some composite events.

The union of events \(C\) and \(D\) is the event that a randomly selected student either owns no more than five pairs or owns an odd number. That is:

\(C\cup D=\{0, 1, 2, 3, 4, 5, 7, 9, ...\}\)

The intersection of events \(A\) and \(B\) is the event that a randomly selected student owns no pairs and owns some pairs of jeans. That is:

\(A\cap B = \{0\} \cap \{1, 2, 3, ...\}\) = the empty set \(\emptyset\)

The complement of event \(D\) is the event that a randomly selected student owns an even number of pairs of jeans. That is:

\(D^\prime= \{0, 2, 4, 6, ...\}\)

If \(E = \{0, 1\}\), \(F = \{2, 3\}\), \(G = \{4, 5\}\) and so on, so that:

\(E\cup F\cup G\cup ...=\mathbf{S}\)

then \(E, F, G\), and so on are exhaustive events.
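
The composite events above can be reproduced with Python's built-in set type. This is a minimal sketch, not part of the original lesson. Because a Python set cannot hold the infinite sample space, it assumes an arbitrary cutoff of 19 pairs of jeans (the name `MAX_PAIRS` is hypothetical).

```python
# Assumption: truncate the infinite sample space at 19 pairs so it fits in a Python set.
MAX_PAIRS = 19
S = set(range(MAX_PAIRS + 1))

A = {0}                              # owns none
B = S - {0}                          # owns some
C = {x for x in S if x <= 5}         # owns no more than five pairs
D = {x for x in S if x % 2 == 1}     # owns an odd number

union_CD = C | D                     # elements in C or D or both
intersection_AB = A & B              # elements in both A and B
complement_D = S - D                 # elements of S that are not in D

# E = {0, 1}, F = {2, 3}, G = {4, 5}, ... as a list of two-element events
pairs = [{k, k + 1} for k in range(0, MAX_PAIRS + 1, 2)]
exhaustive = set().union(*pairs) == S

print(sorted(union_CD))
print(intersection_AB == set())      # True means A and B are mutually exclusive
print(sorted(complement_D))
print(exhaustive)                    # True means the pairs together cover S
```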


2.3 - What is Probability (Informally)?

We'll get to the more formal definition of probability soon, but let's think about probability just informally for a moment. How about this as an informal definition?

Probability
a number between 0 and 1, where:

  • a number closer to 0 means "not likely"
  • a number closer to 1 means "quite likely"

If the probability of an event is exactly 0, then the event can't occur. If the probability of an event is exactly 1, then the event will definitely occur.

Try It!

Can you think of an event that cannot occur? Can you think of an event that will definitely occur?
An example of an event that can’t occur is rolling a seven from a fair six-sided die with face values {1, 2, 3, 4, 5, 6}. An example of an event that will definitely occur is the sun setting tonight.

2.4 - How to Assign Probability to Events

We know that probability is a number between 0 and 1. How does an event get assigned a particular probability value? Well, there are three ways of doing so:

  1. the personal opinion approach
  2. the relative frequency approach
  3. the classical approach

On this page, we'll take a look at each approach.

The Personal Opinion Approach

This approach is the simplest in practice, but it is also the least reliable. You might think of it as the "whatever it is to you" approach. Here are some examples:

  • "I think there is an 80% chance of rain today."
  • "I think there is a 50% chance that the world's oil reserves will be depleted by the year 2100."
  • "I think there is a 1% chance that the men's basketball team will end up in the Final Four sometime this decade."

Example 2-4

At which end of the probability scale would you put the probability that:

  1. one day you will die?
  2. you can swim around the world in 30 hours?
  3. you will win the lottery someday?
  4. a randomly selected student will get an A in this course?
  5. you will get an A in this course?

Answer

I think we'd all agree that the probability that you will die one day is 1. On the other hand, the probability that you can swim around the world in 30 hours is nearly 0, as is the probability that you will win the lottery someday. I am going to say that the probability that a randomly selected student will get an A in this course is a probability in the 0.20 to 0.30 range. I'll leave it to you to think about the probability that you will get an A in this course.

The Relative Frequency Approach

The relative frequency approach involves taking the following three steps in order to determine P(A), the probability of an event A:

  1. Perform an experiment a large number of times, n, say.
  2. Count the number of times the event A of interest occurs, call the number N(A), say.
  3. Then, the probability of event A is estimated by the relative frequency:

\(P(A)=\dfrac{N(A)}{n}\)

The relative frequency approach is useful when the classical approach that is described next can't be used.

Example 2-5

When you toss a fair coin with one side designated as a "head" and the other side designated as a "tail", what is the probability of getting a head?

Answer

I think you all might instinctively reply \(\dfrac{1}{2}\). Of course, right? Well, there are three people who once felt compelled to determine the probability of getting a head using the relative frequency approach:

Coin Tosser     n, the number of tosses made    N(H), the number of heads tossed    P(H)
Count Buffon    4,040                           2,048                               0.5069
Karl Pearson    24,000                          12,012                              0.5005
John Kerrich    10,000                          5,067                               0.5067

As you can see, the relative frequency approach yields a pretty good approximation to the 0.50 probability that we would all expect of a fair coin. Perhaps this example also illustrates the large number of times an experiment has to be conducted in order to get reliable results when using the relative frequency approach.
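
The effect of the number of tosses can be explored with a small simulation. The sketch below is not part of the original experiments. It assumes a simulated fair coin driven by Python's pseudo-random number generator with an arbitrary fixed seed, and the function name `relative_frequency_of_heads` is hypothetical. It also recomputes N(H)/n for the three recorded experiments.

```python
import random

def relative_frequency_of_heads(n, rng):
    """Toss a simulated fair coin n times and return N(H) / n."""
    heads = sum(1 for _ in range(n) if rng.random() < 0.5)
    return heads / n

rng = random.Random(414)  # arbitrary fixed seed so the run is repeatable
estimates = {n: relative_frequency_of_heads(n, rng) for n in (10, 100, 10_000, 100_000)}
for n, estimate in estimates.items():
    print(n, estimate)

# Relative frequencies from the recorded experiments: (n, N(H))
recorded = {
    "Count Buffon": (4040, 2048),
    "Karl Pearson": (24000, 12012),
    "John Kerrich": (10000, 5067),
}
recorded_freq = {name: heads / n for name, (n, heads) in recorded.items()}
for name, freq in recorded_freq.items():
    print(name, round(freq, 4))
```

Estimates from small n can land far from 0.5, while those from large n tend to settle near it, which is the point the example makes about needing many repetitions.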

By the way, Count Buffon (1707-1788) was a French naturalist and mathematician who often pondered interesting probability problems. His most famous question

Suppose we have a floor made of parallel strips of wood, each the same width, and we drop a needle onto the floor. What is the probability that the needle will lie across a line between two strips?

came to be known as Buffon's needle problem. Karl Pearson (1857-1936) effectively established the field of mathematical statistics. And, once you hear John Kerrich's story, you might understand why he, of all people, carried out such a mind-numbing experiment. He was an English mathematician who was lecturing at the University of Copenhagen when World War II broke out. He was arrested by the Germans and spent the war interned in a prison camp in Denmark. To help pass the time he performed a number of probability experiments, such as this coin-tossing one.

Example 2-6

Some trees in a forest were showing signs of disease. A random sample of 200 trees of various sizes was examined yielding the following results:

Type      Disease free    Doubtful    Diseased    Total
Large     35              18          15          68
Medium    46              32          14          92
Small     24              8           8           40
Total     105             58          37          200

What is the probability that one tree selected at random is large?

Answer

There are 68 large trees out of 200 total trees, so the relative frequency approach would tell us that the probability that a tree selected at random is large is 68/200 = 0.34.

What is the probability that one tree selected at random is diseased?

Answer

There are 37 diseased trees out of 200 total trees, so the relative frequency approach would tell us that the probability that a tree selected at random is diseased is 37/200 = 0.185.

What is the probability that one tree selected at random is both small and diseased?

Answer

There are 8 small, diseased trees out of 200 total trees, so the relative frequency approach would tell us that the probability that a tree selected at random is small and diseased is 8/200 = 0.04.

What is the probability that one tree selected at random is either small or disease-free?

Answer

There are 121 trees (35 + 46 + 24 + 8 + 8) out of 200 total trees that are either small or disease-free, so the relative frequency approach would tell us that the probability that a tree selected at random is either small or disease-free is 121/200 = 0.605.

What is the probability that one tree selected at random from the population of medium trees is doubtful of disease?

Answer

There are 92 medium trees in the sample. Of those 92 medium trees, 32 have been identified as being doubtful of disease. Therefore, the relative frequency approach would tell us that the probability that a medium tree selected at random is doubtful of disease is 32/92 = 0.348.
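
The five answers above can be recomputed directly from the table counts. The following is a minimal sketch in which the dictionary layout and the variable names are illustrative choices, not part of the original example.

```python
# Counts from the table: rows are tree sizes, columns are disease status.
counts = {
    "Large":  {"Disease free": 35, "Doubtful": 18, "Diseased": 15},
    "Medium": {"Disease free": 46, "Doubtful": 32, "Diseased": 14},
    "Small":  {"Disease free": 24, "Doubtful": 8,  "Diseased": 8},
}
n = sum(sum(row.values()) for row in counts.values())  # total number of trees

p_large = sum(counts["Large"].values()) / n
p_diseased = sum(row["Diseased"] for row in counts.values()) / n
p_small_and_diseased = counts["Small"]["Diseased"] / n

# "Small or disease-free": count each cell in the Small row or the Disease free column once
p_small_or_free = sum(
    c
    for size, row in counts.items()
    for status, c in row.items()
    if size == "Small" or status == "Disease free"
) / n

# Restricting attention to medium trees changes the denominator to the Medium row total
p_doubtful_among_medium = counts["Medium"]["Doubtful"] / sum(counts["Medium"].values())

print(p_large, p_diseased, p_small_and_diseased, p_small_or_free)
print(round(p_doubtful_among_medium, 3))
```

Counting cells with an "or" condition avoids double-counting the small, disease-free trees, which is the step most easily missed when adding the row and column totals by hand.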

The Classical Approach

The classical approach is the method that we will investigate quite extensively in the next lesson. As long as the outcomes in the sample space are equally likely (!!!), the probability of event \(A\) is:

\(P(A)=\dfrac{N(A)}{N(\mathbf{S})}\)

where \(N(A)\) is the number of elements in the event \(A\), and \(N(\mathbf{S})\) is the number of elements in the sample space \(\mathbf{S}\). Let's take a look at an example.

Example 2-7

Suppose you draw one card at random from a standard deck of 52 cards. Recall that a standard deck of cards contains 13 face values (Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, and King) in 4 different suits (Clubs, Diamonds, Hearts, and Spades) for a total of 52 cards. Assume the cards were manufactured to ensure that each outcome is equally likely with a probability of 1/52. Let \(A\) be the event that the card drawn is a 2, 3, or 7. Let \(B\) be the event that the card is a 2 of hearts (H), 3 of diamonds (D), 8 of spades (S) or king of clubs (C). That is:

  • \(A= \{x: x \text{ is a }2, 3,\text{ or }7\}\)
  • \(B = \{x: x\text{ is 2H, 3D, 8S, or KC}\}\)

Then:

  1. What is the probability that a 2, 3, or 7 is drawn?
  2. What is the probability that the card is a 2 of hearts, 3 of diamonds, 8 of spades or king of clubs?
  3. What is the probability that the card is either a 2, 3, or 7 or a 2 of hearts, 3 of diamonds, 8 of spades or king of clubs?
  4. What is \(P(A\cap B)\)?

Answer
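
A minimal sketch of how the classical approach could be applied to these four questions, assuming all 52 cards are equally likely as stated above. The deck encoding and the helper name `classical_probability` are illustrative, not part of the original example.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["C", "D", "H", "S"]
deck = set(product(ranks, suits))          # sample space S: 52 equally likely cards

A = {card for card in deck if card[0] in {"2", "3", "7"}}
B = {("2", "H"), ("3", "D"), ("8", "S"), ("K", "C")}

def classical_probability(event, sample_space):
    """N(event) / N(S); valid only when the outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

p_A = classical_probability(A, deck)             # 12 of the 52 cards
p_B = classical_probability(B, deck)             # 4 of the 52 cards
p_A_or_B = classical_probability(A | B, deck)    # 2H and 3D are counted only once
p_A_and_B = classical_probability(A & B, deck)   # the cards 2H and 3D
print(p_A, p_B, p_A_or_B, p_A_and_B)
```

Note that \(N(A\cup B)\) is 14 rather than 12 + 4 = 16, because the 2 of hearts and the 3 of diamonds belong to both events.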


2.5 - What is Probability (Formally)?

Previously, we defined probability informally. Now, let's take a look at a formal definition using the “axioms of probability.”

Probability of the Event

Probability is a (real-valued) set function \(P\) that assigns to each event \(A\) in the sample space \(\mathbf{S}\) a number \(P(A)\), called the probability of the event \(A\), such that the following hold:

  1. The probability of any event \(A\) must be nonnegative, that is, \(P(A)\ge 0\).
  2. The probability of the sample space is 1, that is, \(P(\mathbf{S})=1\).
  3. Given mutually exclusive events \(A_1, A_2, A_3, \ldots\), that is, where \(A_i\cap A_j=\emptyset\) for \(i\ne j\):

    1. the probability of a finite union of the events is the sum of the probabilities of the individual events, that is:

    \(P(A_1\cup A_2 \cup \cdots \cup A_k)=P(A_1)+P(A_2)+\cdots+P(A_k)\)

    2. the probability of a countably infinite union of the events is the sum of the probabilities of the individual events, that is:

    \(P(A_1\cup A_2 \cup \cdots )=P(A_1)+P(A_2)+\cdots \)

Example 2-8

Suppose that a Stat 414 class contains 43 students, such that 1 is a Freshman, 4 are Sophomores, 20 are Juniors, 9 are Seniors, and 9 are Graduate students:

Status        Fresh    Soph    Jun     Sen     Grad    Total
Count         1        4       20      9       9       43
Proportion    0.02     0.09    0.47    0.21    0.21

Randomly select one student from the Stat 414 class. Defining the following events:

  • Fr = the event that a Freshman is selected
  • So = the event that a Sophomore is selected
  • Ju = the event that a Junior is selected
  • Se = the event that a Senior is selected
  • Gr = the event that a Graduate student is selected

The sample space is S = {Fr, So, Ju, Se, Gr}. Using the relative frequency approach to assigning probability to the events:

  • P(Fr) = 0.02
  • P(So) = 0.09
  • P(Ju) = 0.47
  • P(Se) = 0.21
  • P(Gr) = 0.21

Let's check to make sure that each of the three axioms of probability is satisfied.
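
One way such a check might look in code is sketched below. It works from the exact counts rather than the rounded proportions so that the arithmetic is exact, and it tests additivity on a single pair of mutually exclusive events as an illustration, not as a proof. The function `P` and the variable names are hypothetical.

```python
from fractions import Fraction

counts = {"Fr": 1, "So": 4, "Ju": 20, "Se": 9, "Gr": 9}
total = sum(counts.values())  # 43 students

def P(event):
    """Relative-frequency probability of an event, given as a set of class standings."""
    return Fraction(sum(counts[outcome] for outcome in event), total)

S = set(counts)

axiom_1 = all(P({outcome}) >= 0 for outcome in S)   # nonnegativity
axiom_2 = P(S) == 1                                 # the sample space has probability 1

# Additivity for one pair of mutually exclusive events: {Fr, So} and {Gr}
lower, grad = {"Fr", "So"}, {"Gr"}
axiom_3 = lower.isdisjoint(grad) and P(lower | grad) == P(lower) + P(grad)

print(axiom_1, axiom_2, axiom_3)
```

The rounded proportions in the table (0.02, 0.09, 0.47, 0.21, 0.21) also add up to 1.00, so the same conclusions hold for the values listed above.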


2.6 - Five Theorems

Now, let's use the axioms of probability to derive yet more helpful probability rules. We'll work through five theorems in all, in each case first stating the theorem and then proving it. Then, once we've added the five theorems to our probability tool box, we'll close this lesson by applying the theorems to a few examples.


2.7 - Some Examples

Example 2-9

A company has bid on two large construction projects. The company president believes that the probability of winning the first contract is 0.6, the probability of winning the second contract is 0.4, and the probability of winning both contracts is 0.2.

  1. What is the probability that the company wins at least one contract?
  2. What is the probability that the company wins the first contract but not the second contract?
  3. What is the probability that the company wins neither contract?
  4. What is the probability that the company wins exactly one contract?
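
A possible way to work through the four questions is sketched below. It relies on two standard probability rules, the addition rule \(P(A\cup B)=P(A)+P(B)-P(A\cap B)\) and the complement rule \(P(A^\prime)=1-P(A)\), which may or may not be worded the same way as the lesson's five theorems. The variable names are illustrative.

```python
from fractions import Fraction

p_first = Fraction(6, 10)    # P(A): the company wins the first contract
p_second = Fraction(4, 10)   # P(B): the company wins the second contract
p_both = Fraction(2, 10)     # P(A and B): the company wins both contracts

p_at_least_one = p_first + p_second - p_both    # addition rule for P(A or B)
p_first_not_second = p_first - p_both           # part of A that lies outside B
p_neither = 1 - p_at_least_one                  # complement of (A or B)
p_exactly_one = p_at_least_one - p_both         # (A or B) with (A and B) removed

for label, value in [
    ("at least one", p_at_least_one),
    ("first but not second", p_first_not_second),
    ("neither", p_neither),
    ("exactly one", p_exactly_one),
]:
    print(label, float(value))
```

Exact fractions are used instead of floats so that sums such as 0.6 + 0.4 - 0.2 do not pick up rounding error.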

Example 2-10

If it is known that \(A\subseteq B\), what can be definitively said about \(P(A\cap B)\)?
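
One line of reasoning, offered as a sketch: if every outcome in \(A\) is also in \(B\), then intersecting \(A\) with \(B\) removes nothing from \(A\), so:

\(A\subseteq B \;\Rightarrow\; A\cap B=A \;\Rightarrow\; P(A\cap B)=P(A)\)

The same containment also bounds this value. Writing \(B\) as the union of the mutually exclusive events \(A\) and \(B\cap A^\prime\), the first and third axioms give:

\(P(B)=P(A)+P(B\cap A^\prime)\ge P(A)=P(A\cap B)\)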

Example 2-11

If 7% of the population smokes cigars, 28% of the population smokes cigarettes, and 5% of the population smokes both, what percentage of the population smokes neither cigars nor cigarettes?
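
A short sketch of one way to compute this, using the addition rule to avoid counting people who smoke both twice and then taking the complement. The variable names are illustrative.

```python
pct_cigars = 7        # smokes cigars
pct_cigarettes = 28   # smokes cigarettes
pct_both = 5          # smokes both

# Percentage who smoke cigars or cigarettes (or both)
pct_either = pct_cigars + pct_cigarettes - pct_both
# Everyone else smokes neither
pct_neither = 100 - pct_either
print(pct_either, pct_neither)
```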

