Lesson 9: Moment Generating Functions
Overview
The expected values \(E(X), E(X^2), E(X^3), \ldots, \text{and } E(X^r)\) are called moments. As you have already experienced in some cases, the mean:
\(\mu=E(X)\)
and the variance:
\(\sigma^2=\text{Var}(X)=E(X^2)-\mu^2\)
which are functions of moments, are sometimes difficult to find. Special functions, called moment-generating functions, can sometimes make finding the mean and variance of a random variable simpler.
In this lesson, we'll first learn what a moment-generating function is, and then we'll learn how to use moment generating functions (abbreviated "m.g.f."):
 to find moments and functions of moments, such as \(\mu\) and \(\sigma^2\)
 to identify which probability mass function a random variable \(X\) follows
Objectives
To learn the definition of a moment-generating function.
To find the moment-generating function of a binomial random variable.
To learn how to use a moment-generating function to find the mean and variance of a random variable.
To learn how to use a moment-generating function to identify which probability mass function a random variable \(X\) follows.
 To understand the steps involved in each of the proofs in the lesson.
 To be able to apply the methods learned in the lesson to new problems.
9.1  What is an MGF?
Moment generating function of \(X\)

Let \(X\) be a discrete random variable with probability mass function \(f(x)\) and support \(S\). Then:
\(M(t)=E(e^{tX})=\sum\limits_{x\in S} e^{tx}f(x)\)
is the moment generating function of \(X\) as long as the summation is finite for some interval of \(t\) around 0. That is, \(M(t)\) is the moment generating function ("m.g.f.") of \(X\) if there is a positive number \(h\) such that the above summation exists and is finite for \(-h<t<h\).
Example 9-1
What is the moment generating function of a binomial random variable \(X\)?
Once we find the moment generating function of a random variable, we can use it to... tada!... generate moments!
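The closed form derived for Example 9-1, \(M(t)=[(1-p)+pe^t]^n\), can be checked numerically. The short sketch below (function names are ours, not part of the lesson) sums \(e^{tx}f(x)\) over the binomial support and compares the result with the closed form:

```python
import math

def binomial_pmf(n, p, x):
    # f(x) = C(n, x) p^x (1 - p)^(n - x), the binomial p.m.f.
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def mgf_from_definition(n, p, t):
    # M(t) = E[e^{tX}] = sum of e^{tx} f(x) over the support {0, 1, ..., n}
    return sum(math.exp(t * x) * binomial_pmf(n, p, x) for x in range(n + 1))

def mgf_closed_form(n, p, t):
    # The closed form from Example 9-1: [(1 - p) + p e^t]^n
    return ((1 - p) + p * math.exp(t))**n

# The two agree for any choice of n, p, and t
print(mgf_from_definition(10, 0.3, 0.5))
print(mgf_closed_form(10, 0.3, 0.5))
```

Note also that both versions evaluate to 1 at \(t=0\), as any m.g.f. must, since \(M(0)=E(e^0)=1\).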
9.2  Finding Moments
Proposition
If a moment-generating function exists for a random variable \(X\), then:

The mean of \(X\) can be found by evaluating the first derivative of the moment-generating function at \(t=0\). That is:
\(\mu=E(X)=M'(0)\)

The variance of \(X\) can be found by evaluating the first and second derivatives of the moment-generating function at \(t=0\). That is:
\(\sigma^2=E(X^2)-[E(X)]^2=M''(0)-[M'(0)]^2\)
Before we prove the above proposition, recall that \(E(X), E(X^2), \ldots, E(X^r)\) are called moments about the origin. It is for this reason, and the above proposition, that the function \(M(t)\) is called a moment-generating function. That is, \(M(t)\) generates moments! The proposition actually doesn't tell the whole story. In fact, in general, the \(r^{th}\) moment about the origin can be found by evaluating the \(r^{th}\) derivative of the moment-generating function at \(t=0\). That is:
\(M^{(r)}(0)=E(X^r)\)
Now, let's prove the proposition.
Proof
We begin the proof by recalling that the moment-generating function is defined as follows:
\(M(t)=E(e^{tX})=\sum\limits_{x\in S} e^{tx} f(x)\)
And, by definition, \(M(t)\) is finite on some interval of \(t\) around 0. That tells us two things:
 Derivatives of all orders exist at \(t=0\).
 It is okay to interchange differentiation and summation.
That said, we can now work on the gory details of the proof:
Example 9-2
Use the moment-generating function for a binomial random variable \(X\):
\(M(t)=[(1-p)+p e^t]^n\)
to find the mean \(\mu\) and variance \(\sigma^2\) of a binomial random variable.
Solution
Keeping in mind that we need to take the first derivative of \(M(t)\) with respect to \(t\), we get:
\(M'(t)=n[1-p+pe^t]^{n-1} (pe^t)\)
And, setting \(t=0\), we get the binomial mean \(\mu=np\).
To find the variance, we first need to take the second derivative of \(M(t)\) with respect to \(t\). Doing so, we get:
\(M''(t)=n[1-p+pe^t]^{n-1} (pe^t)+(pe^t) n(n-1)[1-p+pe^t]^{n-2} (pe^t)\)
And, setting \(t=0\), and using the formula for the variance, we get the binomial variance \(\sigma^2=np(1-p)\).
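The derivative calculations above can be double-checked numerically. The sketch below (a rough check of our own, using the binomial m.g.f. with \(n=20\), \(p=1/4\)) approximates \(M'(0)\) and \(M''(0)\) with finite differences and recovers \(\mu=np\) and \(\sigma^2=np(1-p)\):

```python
import math

def M(t, n=20, p=0.25):
    # Binomial m.g.f.: [(1 - p) + p e^t]^n
    return ((1 - p) + p * math.exp(t))**n

h = 1e-5
# M'(0) by a central difference: should be close to np = 5
mean = (M(h) - M(-h)) / (2 * h)
# M''(0) by a second-order central difference: should be close to E(X^2)
second_moment = (M(h) - 2 * M(0.0) + M(-h)) / h**2
# Variance via sigma^2 = M''(0) - [M'(0)]^2, which should be np(1 - p) = 3.75
variance = second_moment - mean**2

print(round(mean, 3), round(variance, 3))
```

The step size \(h\) trades truncation error against floating-point cancellation; \(10^{-5}\) is small enough here for three-decimal agreement.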
Not only can a moment-generating function be used to find moments of a random variable, it can also be used to identify which probability mass function a random variable follows.
9.3  Finding Distributions
Proposition
A moment-generating function uniquely determines the probability distribution of a random variable.
Proof
If the support \(S\) is \(\{b_1, b_2, b_3, \ldots\}\), then the moment-generating function:
\(M(t)=E(e^{tX})=\sum\limits_{x\in S} e^{tx} f(x)\)
is given by:
\(M(t)=e^{tb_1}f(b_1)+e^{tb_2}f(b_2)+e^{tb_3}f(b_3)+\cdots\)
Therefore, the coefficient of:
\(e^{tb_i}\)
is the probability:
\(f(b_i)=P(X=b_i)\)
This implies that if two random variables have the same moment-generating function, then they must have the same probability distribution.
Example 9-3
If a random variable \(X\) has the following moment-generating function:
\(M(t)=\left(\dfrac{3}{4}+\dfrac{1}{4}e^t\right)^{20}\)
for all \(t\), then what is the p.m.f. of \(X\)?
Solution
We previously determined that the moment generating function of a binomial random variable is:
\(M(t)=[(1p)+p e^t]^n\)
for \(-\infty<t<\infty\). Comparing the given moment generating function with that of a binomial random variable, we see that \(X\) must be a binomial random variable with \(n = 20\) and \(p=\frac{1}{4}\). Therefore, the p.m.f. of \(X\) is:
\(f(x)=\dbinom{20}{x} \left(\dfrac{1}{4}\right)^x \left(\dfrac{3}{4}\right)^{20-x}\)
for \(x=0, 1, \ldots, 20\).
Example 9-4
If a random variable \(X\) has the following moment-generating function:
\(M(t)=\dfrac{1}{10}e^t+\dfrac{2}{10}e^{2t} + \dfrac{3}{10}e^{3t}+ \dfrac{4}{10}e^{4t}\)
for all \(t\), then what is the p.m.f. of \(X\)?
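By the coefficient-matching argument from the proof in Section 9.3, the coefficient of \(e^{tb}\) in the m.g.f. is \(P(X=b)\), so the p.m.f. can be read off term by term. The sketch below (our own check, with helper names that are not part of the lesson) confirms that the p.m.f. obtained this way reproduces the given m.g.f.:

```python
import math

# Reading the coefficient of e^{tb} as P(X = b) gives the candidate p.m.f.
# f(1) = 1/10, f(2) = 2/10, f(3) = 3/10, f(4) = 4/10
pmf = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}

def mgf_from_pmf(t):
    # M(t) = sum of e^{tx} f(x) over the support
    return sum(math.exp(t * x) * fx for x, fx in pmf.items())

def mgf_given(t):
    # The m.g.f. stated in Example 9-4
    return (math.exp(t) + 2*math.exp(2*t) + 3*math.exp(3*t) + 4*math.exp(4*t)) / 10

# The candidate p.m.f. reproduces the stated m.g.f. at every t tried
for t in (-1.0, 0.0, 0.5, 1.3):
    assert math.isclose(mgf_from_pmf(t), mgf_given(t))
```

The probabilities also sum to 1, as a valid p.m.f. must.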
9.4  Moment Generating Functions
Moment generating functions (mgfs) are functions of \(t\). You can find the mgf by using the definition of the expectation of a function of a random variable. The moment generating function of \(X\) is
\(M_X(t)=E\left[e^{tX}\right]=E\left[\text{exp}(tX)\right] \)
Note that \(\exp(X)\) is another way of writing \(e^X\).
Besides helping to find moments, the moment generating function has an important property often called the uniqueness property: if the mgf exists for a random variable, then there is one and only one distribution associated with that mgf. In other words, the mgf uniquely determines the distribution of a random variable.
Suppose we have the following mgf for a random variable \(Y\)
\(M_Y(t)=\dfrac{e^t}{4-3e^t}, \;\; t<-\ln(0.75)\)
Using the information in this section, we can find \(E(Y^k)\) for any \(k\) if the expectation exists. Let's find \(E(Y)\) and \(E(Y^2)\).
We can solve these in a couple of ways.
We can use the knowledge that \(M^\prime(0)=E(Y)\) and \(M^{\prime\prime}(0)=E(Y^2)\). Then we can find the variance by using \(Var(Y)=E(Y^2)-E(Y)^2\). This is left as an exercise below.
We can recognize that this is a moment generating function for a Geometric random variable with \(p=\frac{1}{4}\). It is also a Negative Binomial random variable with \(r=1\) and \(p=\frac{1}{4}\). Since it is a negative binomial random variable, we know \(E(Y)=\mu=\frac{r}{p}=\frac{1}{\frac{1}{4}}=4\) and \(Var(Y)=\frac{r(1-p)}{p^2}=12\). We can use the formula \(Var(Y)=E(Y^2)-E(Y)^2\) to find \(E(Y^2)\) by
\(E(Y^2)=Var(Y)+E(Y)^2=12+(4)^2=12+16=28\)
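These values can be double-checked directly from the geometric p.m.f. \(f(y)=p(1-p)^{y-1}\) for \(y=1,2,\ldots\). The short sketch below (our own check; the infinite sum is truncated where the tail is numerically negligible) recovers \(E(Y)=4\) and \(E(Y^2)=28\):

```python
# Moments of Y ~ Geometric(p = 1/4) by direct summation of y^k f(y),
# truncating the infinite support where the tail is numerically negligible.
p = 0.25
support = range(1, 2000)
EY = sum(y * p * (1 - p)**(y - 1) for y in support)
EY2 = sum(y**2 * p * (1 - p)**(y - 1) for y in support)
print(EY, EY2)   # close to 4 and 28
```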
Additional Practice Problems
Let \(X\) be a binomial random variable with parameters \(n\) and \(p\). What value of \(p\) maximizes \(P(X=k)\) for \(k=0, 1, \ldots, n\)? This is an example of a statistical method used to estimate \(p\) when a binomial random variable is equal to \(k\). If we assume that \(n\) is known, then we estimate \(p\) by choosing the value of \(p\) that maximizes \(f_X(k)=P(X=k)\). This is known as the method of maximum likelihood estimation. Maximum likelihood estimates are discussed in more detail in STAT 415. When we are trying to find the maximum with respect to \(p\), it often helps to find the maximum of the natural log of \(f_X(k)\). Note: statisticians use the notation \(\log\) to refer to \(\ln\) or \(\log_e\).
\begin{align} P(X=x)&=f_X(x)={n\choose x}p^x(1-p)^{n-x}\\ \ln f_X(x)&=\ln {n\choose x}+x\ln p +(n-x)\ln (1-p)\\ \ell&=\frac{\partial \ln f_X(x)}{\partial p}=\frac{x}{p}-\frac{n-x}{1-p}=\frac{(1-p)x-p(n-x)}{p(1-p)}\\ \qquad \Rightarrow 0&=\frac{(1-p)x-p(n-x)}{p(1-p)} \Rightarrow 0=x(1-p)-p(n-x)=x-xp-np+xp=x-np\\ \qquad \Rightarrow x&=np \Rightarrow \hat{p}=\frac{x}{n}\end{align}
We use \(\hat{p}\) to denote the estimate of \(p\). This estimate makes sense. If \(X\) is the number of successes out of \(n\) trials, then a good estimate of \(p=P(\text{success})\) would be the number of successes out of the total number of trials.
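As a numerical illustration of the maximization above (a sketch with made-up numbers: \(n=20\) trials with \(x=7\) observed successes), a grid search over \(p\) lands on \(\hat{p}=x/n\):

```python
import math

def likelihood(p, n=20, x=7):
    # The binomial p.m.f. C(n, x) p^x (1 - p)^(n - x), viewed as a function of p
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

# Search a fine grid of p values in (0, 1); the maximizer should be x/n = 0.35
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=likelihood)
print(p_hat)   # 0.35
```

In practice the calculus solution \(\hat{p}=x/n\) is exact; the grid search merely confirms it.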
Suppose that \(Y\) has the following mgf.
\(M_Y(t)=\dfrac{e^t}{4-3e^t}, \;\; t<-\ln(0.75)\)
 Find \(E(Y)\).
\(\begin{array}{l}M^{\prime}(t)=e^t\left(4-3e^t\right)^{-1}+3e^{2t}\left(4-3e^t\right)^{-2}\\ E(Y)=M^{\prime}(0)=1+3=4\end{array}\)
 Find \(E(Y^2)\).
\(\begin{array}{l}M^{\prime\prime}(t)=e^t(4-3e^t)^{-1}+3e^{2t}(4-3e^t)^{-2}+6e^{2t}(4-3e^t)^{-2}+18e^{3t}(4-3e^t)^{-3}\\ E(Y^2)=M^{\prime\prime}(0)=1+3+6+18=28\end{array}\)