11 Geometric and Negative Binomial Distributions
Overview
In this lesson, we learn about two more specially named discrete probability distributions, namely the negative binomial distribution and the geometric distribution.
Objectives
Upon completion of this lesson, you should be able to:
- understand the derivation of the formula for the geometric probability mass function.
- explore the key properties, such as the mean and variance, of a geometric random variable.
- calculate probabilities for a geometric random variable.
- explore the key properties, such as the moment-generating function, mean and variance, of a negative binomial random variable.
- calculate probabilities for a negative binomial random variable.
- understand the steps involved in each of the proofs in the lesson.
- apply the methods learned in the lesson to new problems.
11.1 Geometric Distributions
Example 11.1 
A representative from the National Football League’s Marketing Division randomly selects people on a random street in Kansas City, Kansas until he finds a person who attended the last home football game. Let \(p\), the probability that he succeeds in finding such a person, equal 0.20. And, let \(X\) denote the number of people he selects until he finds his first success. What is the probability mass function of \(X\)?
Solution
Def. 11.1 (Geometric Distribution) Assume Bernoulli trials — that is, (1) there are two possible outcomes, (2) the trials are independent, and (3) \(p\), the probability of success, remains the same from trial to trial. Let \(X\) denote the number of trials until the first success. Then, the probability mass function of \(X\) is:
\[ f(x)=P(X=x)=(1-p)^{x-1}p \]
for \(x=1, 2, \ldots\). In this case, we say that \(X\) follows a geometric distribution.
Note that there are (theoretically) an infinite number of geometric distributions. Any specific geometric distribution depends on the value of the parameter \(p\).
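As a minimal sketch of Def. 11.1, the PMF can be written directly in Python. The function name `geometric_pmf` is hypothetical (it does not come from the lesson), and the value \(p=0.20\) is borrowed from Example 11.1.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) for the number of trials until the first success."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if x < 1:
        return 0.0  # the support is x = 1, 2, ...
    return (1 - p) ** (x - 1) * p


p = 0.20  # success probability from Example 11.1
first_five = [geometric_pmf(x, p) for x in range(1, 6)]
# Truncated total probability; it approaches 1 as more terms are added.
total = sum(geometric_pmf(x, p) for x in range(1, 201))
```

Each successive probability is the previous one multiplied by \(1-p\), which is the geometric sequence that gives the distribution its name.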
11.2 Key Properties of a Geometric Random Variable
On this page, we state and then prove four properties of a geometric random variable. In order to prove the properties, we need to recall the sum of the geometric series. So, we may as well get that out of the way first.
Recall
- For \(|r|<1\), the sum of a geometric series is:
\[ g(r)=\sum\limits_{k=0}^\infty ar^k=a+ar+ar^2+ar^3+\cdots=\dfrac{a}{1-r}=a(1-r)^{-1} \]
- Then, differentiating both sides with respect to \(r\) gives the first derivative:
\[ g'(r)=\sum\limits_{k=1}^\infty akr^{k-1}=0+a+2ar+3ar^2+\cdots=\dfrac{a}{(1-r)^2}=a(1-r)^{-2} \]
- And, differentiating both sides once more gives the second derivative with respect to \(r\):
\[ g''(r)=\sum\limits_{k=2}^\infty ak(k-1)r^{k-2}=0+0+2a+6ar+\cdots=\dfrac{2a}{(1-r)^3}=2a(1-r)^{-3} \]
We’ll use the sum of the geometric series (the first point) in proving the first two of the four properties, the first derivative (the second point) in proving the third property, and the second derivative (the third point) in proving the fourth property. Let’s jump right in now!
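As a quick sanity check on the three identities recalled above, the sketch below compares long partial sums of each series with its closed form. The helper `partial_sums` and the values \(a=1\), \(r=0.8\) are illustrative assumptions, not part of the lesson; any \(|r|<1\) should behave the same way.

```python
def partial_sums(a: float, r: float, n_terms: int) -> tuple[float, float, float]:
    """Partial sums of the series for g(r), g'(r), and g''(r)."""
    g = sum(a * r**k for k in range(n_terms))
    g1 = sum(a * k * r ** (k - 1) for k in range(1, n_terms))
    g2 = sum(a * k * (k - 1) * r ** (k - 2) for k in range(2, n_terms))
    return g, g1, g2


a, r = 1.0, 0.8  # illustrative values; the series converge only for |r| < 1
series = partial_sums(a, r, 500)
# Closed forms: a/(1-r), a/(1-r)^2, 2a/(1-r)^3
closed = (a / (1 - r), a / (1 - r) ** 2, 2 * a / (1 - r) ** 3)
```

With 500 terms the neglected tail is far below floating-point precision, so `series` and `closed` should agree to many decimal places.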
11.3 Geometric Examples
Example 11.2 
A representative from the National Football League’s Marketing Division randomly selects people on a random street in Kansas City, Kansas until he finds a person who attended the last home football game. Let \(p\), the probability that he succeeds in finding such a person, equal 0.20. And, let \(X\) denote the number of people he selects until he finds his first success. What is the probability that the marketing representative must select 4 people before he finds one who attended the last home football game?
Solution
To find the desired probability, we need to find \(P(X=4)\), which can be determined readily using the PMF of a geometric random variable with \(p=0.20\), \(1-p=0.80\), and \(x=4\):
\[ P(X=4)=0.80^3 \times 0.20=0.1024 \]
There is about a 10% chance that the marketing representative would have to select 4 people before he would find one who attended the last home football game.
What is the probability that the marketing representative must select more than 6 people before he finds one who attended the last home football game?
Solution
To find the desired probability, we need to find \(P(X>6)=1-P(X\le 6)\), which can be determined readily using the CDF of a geometric random variable, \(P(X\le x)=1-(1-p)^x\), with \(1-p=0.80\) and \(x=6\):
\[ P(X >6)=1-P(X \leq 6)=1-[1-0.8^6]=0.8^6\approx 0.262 \]
There is about a 26% chance that the marketing representative would have to select more than 6 people before he would find one who attended the last home football game.
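The two probabilities above can be reproduced with a few lines of Python. This is only a sketch: the name `geometric_cdf` is hypothetical, and it assumes the closed-form CDF \(P(X\le x)=1-(1-p)^x\), which is cross-checked here by summing the PMF.

```python
def geometric_cdf(x: int, p: float) -> float:
    """P(X <= x) = 1 - (1 - p)^x for x = 1, 2, ..."""
    return 1 - (1 - p) ** x if x >= 1 else 0.0


p = 0.20
p_exactly_4 = (1 - p) ** 3 * p           # P(X = 4) from the PMF
p_more_than_6 = 1 - geometric_cdf(6, p)  # P(X > 6) from the CDF
# Cross-check the CDF by summing the PMF over x = 1, ..., 6.
cdf_by_sum = sum((1 - p) ** (x - 1) * p for x in range(1, 7))
```

Note that \(P(X>6)\) is simply the probability that the first six people selected are all failures, \(0.8^6\).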
How many people should we expect the marketing representative to select (that is, what is the average number) before he finds one who attended the last home football game? And, while we’re at it, what is the variance?
Solution
The average number is:
\[ \mu=E(X)=\dfrac{1}{p}=\dfrac{1}{0.20}=5 \]
That is, we should expect the marketing representative to have to select 5 people before he finds one who attended the last football game. Of course, on any given try, it may take 1 person or it may take 10, but 5 is the average number. The variance is 20, as determined by:
\[ \sigma^2=\mathrm{Var}(X)=\dfrac{1-p}{p^2}=\dfrac{0.80}{0.20^2}=20 \]
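One way to build intuition for these two numbers is a small simulation. The sketch below is not part of the lesson: the helper `draw_geometric`, the seed, and the sample size are all arbitrary assumptions. It repeats the "select people until the first success" experiment many times and summarizes the results.

```python
import random
import statistics


def draw_geometric(p: float, rng: random.Random) -> int:
    """Count Bernoulli(p) trials up to and including the first success."""
    trials = 1
    while rng.random() >= p:  # a failure occurs with probability 1 - p
        trials += 1
    return trials


rng = random.Random(414)  # arbitrary seed, fixed only for repeatability
samples = [draw_geometric(0.20, rng) for _ in range(100_000)]
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)
```

With this many draws, the sample mean and sample variance should land near 5 and 20, though the exact values depend on the seed.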
11.4 Negative Binomial Distributions
Example 11.3 
(Are you growing weary of this example yet?) A representative from the National Football League’s Marketing Division randomly selects people on a random street in Kansas City, Kansas until he finds a person who attended the last home football game. Let \(p\), the probability that he succeeds in finding such a person, equal 0.20. Now, let \(X\) denote the number of people he selects until he finds \(r=3\) who attended the last home football game. What is the probability that \(X=10\)?
Solution
Def. 11.2 (Negative Binomial Distribution) Assume Bernoulli trials — that is, (1) there are two possible outcomes, (2) the trials are independent, and (3) \(p\), the probability of success, remains the same from trial to trial. Let \(X\) denote the number of trials until the \(r^{th}\) success. Then, the probability mass function of \(X\) is:
\[ f(x)=P(X=x)=\dbinom{x-1}{r-1} (1-p)^{x-r} p^r \]
for \(x=r, r+1, r+2, \ldots\). In this case, we say that \(X\) follows a negative binomial distribution.
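As a sketch of how Def. 11.2 answers the question in Example 11.3, the PMF can be evaluated at \(x=10\) with \(r=3\) and \(p=0.20\). The function name `negative_binomial_pmf` is hypothetical; `math.comb` from the Python standard library supplies the binomial coefficient.

```python
from math import comb


def negative_binomial_pmf(x: int, r: int, p: float) -> float:
    """P(X = x) for the number of trials until the r-th success."""
    if x < r:
        return 0.0  # the r-th success cannot occur before trial r
    return comb(x - 1, r - 1) * (1 - p) ** (x - r) * p**r


# Example 11.3: r = 3 successes, p = 0.20, tenth person selected.
p_x_equals_10 = negative_binomial_pmf(10, 3, 0.20)  # about 0.0604
```

Written out, this is \(\dbinom{9}{2}(0.80)^7(0.20)^3 \approx 0.0604\), so there is about a 6% chance that the third success comes on the tenth person selected.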
11.5 Key Properties of a Negative Binomial Random Variable
11.6 Negative Binomial Examples
Example 11.4 
An oil company conducts a geological study that indicates that an exploratory oil well should have a 20% chance of striking oil. What is the probability that the first strike comes on the third well drilled?
Solution
To find the requested probability, we need to find \(P(X=3)\). Note that \(X\) is technically a geometric random variable, since we are only looking for one success. Since a geometric random variable is just a special case of a negative binomial random variable, we’ll try finding the probability using the negative binomial PMF. In this case, \(p=0.20, 1-p=0.80, r=1, x=3\), and here’s what the calculation looks like:
\[ \begin{align*} P(X=3)&=\dbinom{3-1}{1-1}(1-p)^{3-1}p^1 \\ &=(1-p)^2 p \\ &=0.80^2\times 0.20 \\ &=0.128 \end{align*} \]
It is at the second equal sign that you can see how the general negative binomial problem reduces to a geometric random variable problem. In any case, there is about a 13% chance that the first strike comes on the third well drilled.
What is the probability that the third strike comes on the seventh well drilled?
Solution
To find the requested probability, we need to find \(P(X=7)\), which can be readily found using the PMF of a negative binomial random variable with \(p=0.20, 1-p=0.80, x=7, r=3\):
\[ \begin{align*} P(X=7)&=\dbinom{7-1}{3-1}(1-p)^{7-3}p^3 \\ &=\dbinom{6}{2}0.80^4\times 0.20^3 \\ &\approx 0.049 \end{align*} \]
That is, there is about a 5% chance that the third strike comes on the seventh well drilled.
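Both oil-well probabilities, and the reduction to the geometric case when \(r=1\), can be checked numerically. In this sketch the name `nb_pmf` is hypothetical and simply restates the PMF from Def. 11.2.

```python
from math import comb, isclose


def nb_pmf(x: int, r: int, p: float) -> float:
    """Negative binomial PMF from Def. 11.2."""
    return comb(x - 1, r - 1) * (1 - p) ** (x - r) * p**r if x >= r else 0.0


p = 0.20
first_strike_on_third = nb_pmf(3, 1, p)    # 0.80^2 * 0.20
third_strike_on_seventh = nb_pmf(7, 3, p)  # C(6,2) * 0.80^4 * 0.20^3
# With r = 1 the PMF matches the geometric PMF (1 - p)^(x - 1) * p.
reduces_to_geometric = all(
    isclose(nb_pmf(x, 1, p), (1 - p) ** (x - 1) * p) for x in range(1, 50)
)
```

The binomial coefficient \(\dbinom{x-1}{0}\) equals 1 for every \(x\), which is why the \(r=1\) case collapses to the geometric formula.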
What are the mean and variance of the number of wells that must be drilled if the oil company wants to set up three producing wells?
Solution
The mean number of wells is:
\[ \mu=E(X)=\dfrac{r}{p}=\dfrac{3}{0.20}=15 \]
with a variance of:
\[ \sigma^2=\mathrm{Var}(X)=\dfrac{r(1-p)}{p^2}=\dfrac{3(0.80)}{0.20^2}=60 \]
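The formulas \(r/p\) and \(r(1-p)/p^2\) can be checked against the definition of the mean and variance by summing over the PMF directly. The sketch below is an illustration only: the helper `nb_moments` and the truncation point `x_max` are assumptions, chosen so that the neglected tail is negligible for \(p=0.20\).

```python
from math import comb


def nb_moments(r: int, p: float, x_max: int = 2000) -> tuple[float, float]:
    """Approximate mean and variance by summing the PMF up to x_max."""
    mean = second_moment = 0.0
    for x in range(r, x_max + 1):
        prob = comb(x - 1, r - 1) * (1 - p) ** (x - r) * p**r
        mean += x * prob
        second_moment += x * x * prob
    # Var(X) = E(X^2) - [E(X)]^2
    return mean, second_moment - mean**2


mean_wells, var_wells = nb_moments(3, 0.20)
```

For \(r=3\) and \(p=0.20\), the truncated sums should agree closely with the closed-form values of 15 and 60.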