24.2 - Expectations of Functions of Independent Random Variables

One of our primary goals in this lesson is to determine the theoretical mean and variance of the sample mean:

\(\bar{X}=\dfrac{X_1+X_2+\cdots+X_n}{n}\)

Now, assume the \(X_i\) are independent, as they should be if they come from a random sample. Then, finding the theoretical mean of the sample mean involves taking the expectation of a sum of independent random variables:

\(E(\bar{X})=\dfrac{1}{n} E(X_1+X_2+\cdots+X_n)\)

That's why we'll spend some time on this page learning how to take expectations of functions of independent random variables! A simple example illustrates that we already have a number of techniques sitting in our toolbox ready to help us find the expectation of a sum of independent random variables.

Example 24-1


Suppose we toss a penny three times. Let \(X_1\) denote the number of heads that we get in the three tosses. And, suppose we toss a second penny two times. Let \(X_2\) denote the number of heads we get in those two tosses. If we let:

\(Y=X_1+X_2\)

then \(Y\) denotes the number of heads in five tosses. Note that the random variables \(X_1\) and \(X_2\) are independent and therefore \(Y\) is the sum of independent random variables. Furthermore, we know that:

  • \(X_1\) is a binomial random variable with \(n=3\) and \(p=\frac{1}{2}\)
  • \(X_2\) is a binomial random variable with \(n=2\) and \(p=\frac{1}{2}\)
  • \(Y\) is a binomial random variable with \(n=5\) and \(p=\frac{1}{2}\)

What is the mean of \(Y\), the sum of two independent random variables? And, what is the variance of \(Y\)?

Solution

We can calculate the mean and variance of \(Y\) in three different ways.

  1. By recognizing that \(Y\) is a binomial random variable with \(n=5\) and \(p=\frac{1}{2}\), we can use what we know about the mean and variance of a binomial random variable, namely that the mean of \(Y\) is:

    \(E(Y)=np=5(\frac{1}{2})=\frac{5}{2}\)

    and the variance of \(Y\) is:

    \(Var(Y)=np(1-p)=5(\frac{1}{2})(\frac{1}{2})=\frac{5}{4}\)

    Since sums of independent random variables are not always going to be binomial, this approach won't always work, of course. It would be good to have alternative methods in hand!

  2. We could use the linear operator property of expectation. Before doing so, it would be helpful to note that the mean of \(X_1\) is:

    \(E(X_1)=np=3(\frac{1}{2})=\frac{3}{2}\)

    and the mean of \(X_2\) is:

    \(E(X_2)=np=2(\frac{1}{2})=1\)

    Now, using the property, we get that the mean of \(Y\) is (thankfully) again \(\frac{5}{2}\):

    \(E(Y)=E(X_1+X_2)=E(X_1)+E(X_2)=\dfrac{3}{2}+1=\dfrac{5}{2}\)

    Recall that the second equality comes from the linear operator property of expectation. Now, using the linear operator property of expectation to find the variance of \(Y\) takes a bit more work. First, we should note that the variance of \(X_1\) is:

    \(Var(X_1)=np(1-p)=3(\frac{1}{2})(\frac{1}{2})=\frac{3}{4}\)

    and the variance of \(X_2\) is:

    \(Var(X_2)=np(1-p)=2(\frac{1}{2})(\frac{1}{2})=\frac{1}{2}\)

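    Now, we can (thankfully) show again that the variance of \(Y\) is \(\frac{5}{4}\). Writing \(\mu_1=E(X_1)\) and \(\mu_2=E(X_2)\), expanding the square, and applying the linear operator property of expectation gives:

    \begin{align} Var(Y) &= E[((X_1+X_2)-(\mu_1+\mu_2))^2]\\ &= E[(X_1-\mu_1)^2]+2E[(X_1-\mu_1)(X_2-\mu_2)]+E[(X_2-\mu_2)^2]\\ &= Var(X_1)+0+Var(X_2)\\ &= \dfrac{3}{4}+\dfrac{1}{2}=\dfrac{5}{4} \end{align}

    The cross term is zero because \(X_1\) and \(X_2\) are independent, so \(E[(X_1-\mu_1)(X_2-\mu_2)]=E(X_1-\mu_1)E(X_2-\mu_2)=0\).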

    Okay, as if two methods aren't enough, we still have one more method we could use.

  3. We could use the independence of the two random variables \(X_1\) and \(X_2\), in conjunction with the definition of expected value of \(Y\) as we know it. First, using the binomial formula, note that we can present the probability mass function of \(X_1\) in tabular form as:

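    \(\begin{array}{c|cccc} x_1 & 0 & 1 & 2 & 3\\ \hline f(x_1) & \frac{1}{8} & \frac{3}{8} & \frac{3}{8} & \frac{1}{8} \end{array}\)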

    And, we can present the probability mass function of \(X_2\) in tabular form as well:

    \(\begin{array}{c|ccc} x_2 & 0 & 1 & 2\\ \hline f(x_2) & \frac{1}{4} & \frac{2}{4} & \frac{1}{4} \end{array}\)

    Now, recall that if \(X_1\) and \(X_2\) are independent random variables, then:

    \(f(x_1,x_2)=f(x_1)\cdot f(x_2)\)

    We can use this result to help determine \(g(y)\), the probability mass function of \(Y\). First note that, since \(Y\) is the sum of \(X_1\) and \(X_2\), the support of \(Y\) is \(\{0, 1, 2, 3, 4, 5\}\). Now, by brute force, we get:

    \(g(0)=P(Y=0)=P(X_1=0,X_2=0)=f(0,0)=f_{X_1}(0) \cdot f_{X_2}(0)=\dfrac{1}{8} \cdot \dfrac{1}{4}=\dfrac{1}{32}\)

    The second equality comes from the fact that the only way that \(Y\) can equal 0 is if \(X_1=0\) and \(X_2=0\), and the fourth equality comes from the independence of \(X_1\) and \(X_2\). We can make a similar calculation to find the probability that \(Y=1\):

    \(g(1)=P(X_1=0,X_2=1)+P(X_1=1,X_2=0)=f_{X_1}(0) \cdot f_{X_2}(1)+f_{X_1}(1) \cdot f_{X_2}(0)=\dfrac{1}{8} \cdot \dfrac{2}{4}+\dfrac{3}{8} \cdot \dfrac{1}{4}=\dfrac{5}{32}\)

    The first equality comes from the fact that there are two (mutually exclusive) ways that \(Y\) can equal 1, namely if \(X_1=0\) and \(X_2=1\) or if \(X_1=1\) and \(X_2=0\). The second equality comes from the independence of \(X_1\) and \(X_2\). We can make similar calculations to find \(g(2), g(3), g(4)\), and \(g(5)\). Once we've done that, we can present the p.m.f. of \(Y\) in tabular form as:

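    \(\begin{array}{c|cccccc} y=x_1+x_2 & 0 & 1 & 2 & 3 & 4 & 5\\ \hline g(y) & \frac{1}{32} & \frac{5}{32} & \frac{10}{32} & \frac{10}{32} & \frac{5}{32} & \frac{1}{32} \end{array}\)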

    Then, it is a straightforward calculation to use the definition of the expected value of a discrete random variable to determine that (again!) the expected value of \(Y\) is \(\frac{5}{2}\):

    \(E(Y)=0(\frac{1}{32})+1(\frac{5}{32})+2(\frac{10}{32})+\cdots+5(\frac{1}{32})=\frac{80}{32}=\frac{5}{2}\)

    The variance of \(Y\) can be calculated similarly. (Do you want to calculate it one more time?!)

    The following summarizes the method we've used here in calculating the expected value of \(Y\):

    \begin{align} E(Y)=E(X_1+X_2) &= \sum\limits_{x_1 \in S_1}\sum\limits_{x_2 \in S_2} (x_1+x_2)f(x_1,x_2)\\ &= \sum\limits_{x_1 \in S_1}\sum\limits_{x_2 \in S_2} (x_1+x_2)f(x_1) f(x_2)\\ &= \sum\limits_{y \in S} yg(y)\\ \end{align}

    The first equality comes, of course, from the definition of \(Y\). The second equality comes from the definition of the expectation of a function of discrete random variables. The third equality comes from the independence of the random variables \(X_1\) and \(X_2\). And, the fourth equality comes from the definition of the expected value of \(Y\), as well as the fact that \(g(y)\) can be determined by summing the appropriate joint probabilities of \(X_1\) and \(X_2\).
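The three methods can be cross-checked numerically. The following is a minimal sketch in Python, not part of the original lesson; the helper names `binomial_pmf`, `pmf_of_sum`, `mean`, and `variance` are illustrative assumptions. It uses exact fractions, builds \(g(y)\) by summing the joint probabilities \(f(x_1)f(x_2)\) over all pairs with \(x_1+x_2=y\), and compares the results against the binomial formulas and against the sums of the individual means and variances.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Return {k: P(X = k)} for a binomial(n, p) random variable."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


def pmf_of_sum(f1, f2):
    """p.m.f. of X1 + X2 when X1 and X2 are independent.

    Independence is what lets us use f1(x1) * f2(x2) as the joint p.m.f.
    """
    g = {}
    for x1, p1 in f1.items():
        for x2, p2 in f2.items():
            g[x1 + x2] = g.get(x1 + x2, 0) + p1 * p2
    return g


def mean(f):
    """E(X) from the definition: sum of x * f(x)."""
    return sum(x * p for x, p in f.items())


def variance(f):
    """Var(X) from the definition: sum of (x - mu)^2 * f(x)."""
    mu = mean(f)
    return sum((x - mu) ** 2 * p for x, p in f.items())


half = Fraction(1, 2)
f1 = binomial_pmf(3, half)  # three tosses of the first penny
f2 = binomial_pmf(2, half)  # two tosses of the second penny
g = pmf_of_sum(f1, f2)      # method 3: brute-force p.m.f. of Y

print("g(y):", {y: str(p) for y, p in sorted(g.items())})
print("E(Y) =", mean(g))        # 5/2
print("Var(Y) =", variance(g))  # 5/4

# Method 1: Y is binomial(5, 1/2).
print(g == binomial_pmf(5, half))
# Method 2: means add, and (by independence) variances add.
print(mean(f1) + mean(f2) == mean(g), variance(f1) + variance(f2) == variance(g))
```

Note that `Fraction` reduces to lowest terms, so \(\frac{10}{32}\) prints as `5/16`.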

The following theorem formally states the third method we used in determining the expected value of \(Y\), a function of two independent random variables. We state the theorem without proof. (If you're interested, you can find a proof of it in Hogg, McKean and Craig, 2005.)

Theorem

Let \(X_1, X_2, \ldots, X_n\) be \(n\) independent random variables that, by their independence, have the joint probability mass function:

\(f_1(x_1)f_2(x_2)\cdots f_n(x_n)\)

Let the random variable \(Y=u(X_1,X_2, \ldots, X_n)\) have the probability mass function \(g(y)\). Then, in the discrete case:

\(E(Y)=\sum\limits_y yg(y)=\sum\limits_{x_1}\sum\limits_{x_2}\cdots\sum\limits_{x_n}u(x_1,x_2,\ldots,x_n) f_1(x_1)f_2(x_2)\cdots f_n(x_n)\)

provided that these summations exist. For continuous random variables, integrals replace the summations.
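To see the right-hand side of the theorem in action, here is a small Python sketch (an illustration, not part of the original lesson). The function name `expectation` is a hypothetical helper; it assumes each independent discrete random variable is supplied as a dictionary mapping values to probabilities, and it evaluates the multiple sum directly without ever constructing \(g(y)\).

```python
from fractions import Fraction
from itertools import product


def expectation(u, pmfs):
    """E[u(X1, ..., Xn)] for independent discrete X_i.

    pmfs is a list of {value: probability} dicts, one per X_i.
    The joint probability of each combination is the product of the
    marginal probabilities, which is valid only under independence.
    """
    total = 0
    for combo in product(*(f.items() for f in pmfs)):
        values = [x for x, _ in combo]
        prob = 1
        for _, p in combo:
            prob *= p
        total += u(*values) * prob
    return total


# p.m.f.s of X1 (three tosses) and X2 (two tosses) from the penny example
f1 = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
f2 = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}

e_y = expectation(lambda x1, x2: x1 + x2, [f1, f2])
var_y = expectation(lambda x1, x2: (x1 + x2 - e_y) ** 2, [f1, f2])
print(e_y, var_y)  # 5/2 5/4
```

Because `u` is an arbitrary function, the same helper works for any \(u(X_1, X_2, \ldots, X_n)\) with finite supports, not only for sums.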

In the special case that we are looking for the expectation of the product of functions of \(n\) independent random variables, the following theorem will help us out.

Theorem
If \(X_1, X_2, \ldots, X_n\) are independent random variables and, for \(i=1, 2, \ldots, n\), the expectation \(E[u_i(X_i)]\) exists, then:

\(E[u_1(X_1)u_2(X_2)\cdots u_n(X_n)]=E[u_1(X_1)]E[u_2(X_2)]\cdots E[u_n(X_n)]\)

That is, the expectation of the product is the product of the expectations.

Proof

For the sake of concreteness, let's assume that the random variables are discrete. Then, the definition of expectation gives us:

\(E[u_1(X_1)u_2(X_2)\cdots u_n(X_n)]=\sum\limits_{x_1}\sum\limits_{x_2}\cdots \sum\limits_{x_n} u_1(x_1)u_2(x_2)\cdots u_n(x_n) f_1(x_1)f_2(x_2)\cdots f_n(x_n)\)

Then, since functions that don't depend on the index of the summation signs can get pulled through the summation signs, we have:

\(E[u_1(X_1)u_2(X_2)\cdots u_n(X_n)]=\sum\limits_{x_1}u_1(x_1)f_1(x_1) \sum\limits_{x_2}u_2(x_2)f_2(x_2)\cdots \sum\limits_{x_n}u_n(x_n)f_n(x_n)\)

Then, by the definition, in the discrete case, of the expected value of \(u_i(X_i)\), our expectation reduces to:

\(E[u_1(X_1)u_2(X_2)\cdots u_n(X_n)]=E[u_1(X_1)]E[u_2(X_2)]\cdots E[u_n(X_n)]\)

Our proof is complete. If our random variables are instead continuous, the proof would be similar. We would just need to make the obvious change of replacing the summation signs with integrals.
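To make the step that pulls factors through the summation signs more concrete, here is the same argument written out for \(n=2\) in the discrete case. It is only a special case of the proof, shown under the same assumption that the sums exist:

\begin{align} E[u_1(X_1)u_2(X_2)] &= \sum\limits_{x_1}\sum\limits_{x_2} u_1(x_1)u_2(x_2)f_1(x_1)f_2(x_2)\\ &= \sum\limits_{x_1}u_1(x_1)f_1(x_1)\left[\sum\limits_{x_2}u_2(x_2)f_2(x_2)\right]\\ &= \left[\sum\limits_{x_1}u_1(x_1)f_1(x_1)\right]\left[\sum\limits_{x_2}u_2(x_2)f_2(x_2)\right]\\ &= E[u_1(X_1)]E[u_2(X_2)] \end{align}

In the second line, \(u_1(x_1)f_1(x_1)\) doesn't depend on \(x_2\), so it moves outside the inner sum. In the third line, the bracketed inner sum doesn't depend on \(x_1\), so it is a constant that factors out of the outer sum.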

Let's return to our example in which we toss a penny three times and let \(X_1\) denote the number of heads that we get in the three tosses. Again, we toss a second penny two times and let \(X_2\) denote the number of heads we get in those two tosses. In our previous work, we learned that:

  • \(E(X_1)=\frac{3}{2}\) and \(\text{Var}(X_1)=\frac{3}{4}\)
  • \(E(X_2)=1\) and \(\text{Var}(X_2)=\frac{1}{2}\)

What is the expected value of \(X_1^2X_2\)?

Solution

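We'll use the fact that the expectation of the product is the product of the expectations, together with the identity \(E(X_1^2)=\text{Var}(X_1)+[E(X_1)]^2\):

\(E(X_1^2X_2)=E(X_1^2)E(X_2)=\left[\dfrac{3}{4}+\left(\dfrac{3}{2}\right)^2\right](1)=3\)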
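As a hedged numerical check (a sketch, not part of the original lesson), the Python snippet below computes \(E(X_1^2X_2)\) in two ways: directly, as a double sum over the joint p.m.f. \(f(x_1)f(x_2)\), and as the product \(E(X_1^2)E(X_2)\). The variable names are illustrative, and the p.m.f.s are those of binomial random variables with \(n=3\) and \(n=2\) and \(p=\frac{1}{2}\).

```python
from fractions import Fraction

# p.m.f.s of X1 (three tosses) and X2 (two tosses)
f1 = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}
f2 = {0: Fraction(1, 4), 1: Fraction(2, 4), 2: Fraction(1, 4)}

# Direct computation: double sum of x1^2 * x2 * f(x1) * f(x2)
direct = sum(
    x1**2 * x2 * p1 * p2
    for x1, p1 in f1.items()
    for x2, p2 in f2.items()
)

# Product of expectations: E(X1^2) * E(X2)
e_x1_squared = sum(x**2 * p for x, p in f1.items())
e_x2 = sum(x * p for x, p in f2.items())
product_form = e_x1_squared * e_x2

# Shortcut for E(X1^2) using Var(X1) + [E(X1)]^2 = 3/4 + (3/2)^2
shortcut = Fraction(3, 4) + Fraction(3, 2) ** 2

print(direct, product_form, shortcut)  # 3 3 3
```

The two routes agree because \(X_1\) and \(X_2\) are independent; for dependent random variables the product form would not be justified.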