19.3 - Conditional Means and Variances

Now that we've mastered the concept of a conditional probability mass function, we'll turn our attention to finding conditional means and variances. We'll start by giving formal definitions of the conditional mean and conditional variance when \(X\) and \(Y\) are discrete random variables. And then we'll end by actually calculating a few!

Definition. Suppose \(X\) and \(Y\) are discrete random variables. Then, the conditional mean of \(Y\) given \(X=x\) is defined as:

\(\mu_{Y|x}=E[Y|x]=\sum\limits_y yh(y|x)\)

And, the conditional mean of \(X\) given \(Y=y\) is defined as:

\(\mu_{X|y}=E[X|y]=\sum\limits_x xg(x|y)\)

The conditional variance of \(Y\) given \(X=x\) is:

\(\sigma^2_{Y|x}=E\{[Y-\mu_{Y|x}]^2|x\}=\sum\limits_y [y-\mu_{Y|x}]^2 h(y|x)\)

or, alternatively, using the usual shortcut:

\(\sigma^2_{Y|x}=E[Y^2|x]-\mu^2_{Y|x}=\left[\sum\limits_y y^2 h(y|x)\right]-\mu^2_{Y|x}\)

And, the conditional variance of \(X\) given \(Y=y\) is:

\(\sigma^2_{X|y}=E\{[X-\mu_{X|y}]^2|y\}=\sum\limits_x [x-\mu_{X|y}]^2 g(x|y)\)

or, alternatively, using the usual shortcut:

\(\sigma^2_{X|y}=E[X^2|y]-\mu^2_{X|y}=\left[\sum\limits_x x^2 g(x|y)\right]-\mu^2_{X|y}\)

As you can see from the formulas, a conditional mean is calculated much like a mean is, and a conditional variance is calculated much like a variance is, except that in each case you replace the probability mass function with a conditional probability mass function. Let's return to one of our examples to get practice calculating a few of these guys.
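The definitions above translate almost line for line into code. The following is a minimal Python sketch, not part of the original lesson. The function names and the example pmf are hypothetical, and a conditional pmf is assumed to be stored as a dictionary that maps each value to its conditional probability.

```python
from fractions import Fraction


def conditional_mean(cond_pmf):
    """Sum of value * probability over a conditional pmf {value: prob}."""
    return sum(value * prob for value, prob in cond_pmf.items())


def conditional_variance(cond_pmf):
    """Definition: expected squared deviation from the conditional mean."""
    mu = conditional_mean(cond_pmf)
    return sum((value - mu) ** 2 * prob for value, prob in cond_pmf.items())


def conditional_variance_shortcut(cond_pmf):
    """Shortcut: conditional second moment minus squared conditional mean."""
    mu = conditional_mean(cond_pmf)
    second_moment = sum(value ** 2 * prob for value, prob in cond_pmf.items())
    return second_moment - mu ** 2


# Hypothetical conditional pmf, used only to exercise the functions.
example_pmf = {1: Fraction(1, 6), 2: Fraction(2, 6), 3: Fraction(3, 6)}

print(conditional_mean(example_pmf))               # 7/3
print(conditional_variance(example_pmf))           # 5/9
print(conditional_variance_shortcut(example_pmf))  # 5/9
```

Exact fractions are used instead of floats so that the definition and the shortcut can be compared for equality without rounding error.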

Example 19-3

Let \(X\) be a discrete random variable with support \(S_1=\{0,1\}\), and let \(Y\) be a discrete random variable with support \(S_2=\{0, 1, 2\}\). Suppose, in tabular form, that \(X\) and \(Y\) have the following joint probability distribution \(f(x,y)\):

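\(\begin{array}{c|ccc|c}
f(x,y) & y=0 & y=1 & y=2 & \text{Total} \\ \hline
x=0 & 1/8 & 2/8 & 1/8 & 4/8 \\
x=1 & 2/8 & 1/8 & 1/8 & 4/8 \\ \hline
\text{Total} & 3/8 & 3/8 & 2/8 & 1
\end{array}\)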

What is the conditional mean of \(Y\) given \(X=x\)?

Solution

We previously determined that the conditional distribution of \(Y\) given \(X\) is:

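\(\begin{array}{c|ccc|c}
h(y|x) & y=0 & y=1 & y=2 & \text{Total} \\ \hline
x=0 & 1/4 & 2/4 & 1/4 & 1 \\
x=1 & 2/4 & 1/4 & 1/4 & 1
\end{array}\)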

Therefore, we can use it, that is, \(h(y|x)\), and the formula for the conditional mean of \(Y\) given \(X=x\) to calculate the conditional mean of \(Y\) given \(X=0\). It is:

\(\mu_{Y|0}=E[Y|0]=\sum\limits_y yh(y|0)=0\left(\dfrac{1}{4}\right)+1\left(\dfrac{2}{4}\right)+2\left(\dfrac{1}{4}\right)=1\)

And, we can use \(h(y|x)\) and the formula for the conditional mean of \(Y\) given \(X=x\) to calculate the conditional mean of \(Y\) given \(X=1\). It is:

\(\mu_{Y|1}=E[Y|1]=\sum\limits_y yh(y|1)=0\left(\dfrac{2}{4}\right)+1\left(\dfrac{1}{4}\right)+2\left(\dfrac{1}{4}\right)=\dfrac{3}{4}\)

Note that the conditional mean of \(Y|X=x\) depends on \(x\), and depends on \(x\) alone. You might want to think about these conditional means in terms of sub-populations again. The mean of \(Y\) is likely to depend on the sub-population, as it does here. The mean of \(Y\) is 1 for the \(X=0\) sub-population, and the mean of \(Y\) is \(\frac{3}{4}\) for the \(X=1\) sub-population. Intuitively, this dependence should make sense. Rather than calculating the average weight of an adult, for example, you would probably want to calculate the average weight for the sub-population of females and the average weight for the sub-population of males, because the average weight no doubt depends on the sub-population!
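The two conditional means can also be checked numerically, starting from the joint distribution. The sketch below is an illustration only: the dictionary `joint` holds the joint probabilities \(f(x,y)\) of this example keyed by `(x, y)`, and the helper name `conditional_mean_y_given_x` is hypothetical.

```python
from fractions import Fraction as F

# Joint pmf f(x, y) of Example 19-3, keyed by (x, y).
joint = {
    (0, 0): F(1, 8), (0, 1): F(2, 8), (0, 2): F(1, 8),
    (1, 0): F(2, 8), (1, 1): F(1, 8), (1, 2): F(1, 8),
}


def conditional_mean_y_given_x(joint, x):
    """E[Y | X = x]: sum over y of y * f(x, y) / f_X(x)."""
    # Marginal probability of X = x (the row total).
    f_x = sum(p for (xi, _), p in joint.items() if xi == x)
    # Each term y * h(y|x), with h(y|x) = f(x, y) / f_X(x).
    return sum(y * p / f_x for (xi, y), p in joint.items() if xi == x)


means_y_given_x = {x: conditional_mean_y_given_x(joint, x) for x in (0, 1)}
for x, mean in means_y_given_x.items():
    print(f"E[Y | X={x}] = {mean}")
```

Each value of \(x\) selects one row of the joint table, which is the sub-population idea expressed in code.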

What is the conditional mean of \(X\) given \(Y=y\)?

Solution

We previously determined that the conditional distribution of \(X\) given \(Y\) is:

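\(\begin{array}{c|ccc}
g(x|y) & y=0 & y=1 & y=2 \\ \hline
x=0 & 1/3 & 2/3 & 1/2 \\
x=1 & 2/3 & 1/3 & 1/2 \\ \hline
\text{Total} & 1 & 1 & 1
\end{array}\)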

As the conditional distribution of \(X\) given \(Y\) suggests, there are three sub-populations here, namely the \(Y=0\) sub-population, the \(Y=1\) sub-population and the \(Y=2\) sub-population. Therefore, we have three conditional means to calculate, one for each sub-population. Now, we can use \(g(x|y)\) and the formula for the conditional mean of \(X\) given \(Y=y\) to calculate the conditional mean of \(X\) given \(Y=0\). It is:

\(\mu_{X|0}=E[X|0]=\sum\limits_x xg(x|0)=0\left(\dfrac{1}{3}\right)+1\left(\dfrac{2}{3}\right)=\dfrac{2}{3}\)

And, we can use \(g(x|y)\) and the formula for the conditional mean of \(X\) given \(Y=y\) to calculate the conditional mean of \(X\) given \(Y=1\). It is:

\(\mu_{X|1}=E[X|1]=\sum\limits_x xg(x|1)=0\left(\dfrac{2}{3}\right)+1\left(\dfrac{1}{3}\right)=\dfrac{1}{3}\)

And, we can use \(g(x|y)\) and the formula for the conditional mean of \(X\) given \(Y=y\) to calculate the conditional mean of \(X\) given \(Y=2\). It is:

\(\mu_{X|2}=E[X|2]=\sum\limits_x xg(x|2)=0\left(\dfrac{1}{2}\right)+1\left(\dfrac{1}{2}\right)=\dfrac{1}{2}\)

Note that the conditional mean of \(X|Y=y\) depends on \(y\), and depends on \(y\) alone. The mean of \(X\) is \(\frac{2}{3}\) for the \(Y=0\) sub-population, the mean of \(X\) is \(\frac{1}{3}\) for the \(Y=1\) sub-population, and the mean of \(X\) is \(\frac{1}{2}\) for the \(Y=2\) sub-population.
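The same three conditional means can be reproduced by first building \(g(x|y)\) for each value of \(y\) and then averaging over \(x\). This is a sketch under the same assumptions as before: `joint` stores the example's joint probabilities keyed by `(x, y)`, and the helper name `conditional_pmf_x_given_y` is hypothetical.

```python
from fractions import Fraction as F

# Joint pmf f(x, y) of Example 19-3, keyed by (x, y).
joint = {
    (0, 0): F(1, 8), (0, 1): F(2, 8), (0, 2): F(1, 8),
    (1, 0): F(2, 8), (1, 1): F(1, 8), (1, 2): F(1, 8),
}


def conditional_pmf_x_given_y(joint, y):
    """g(x|y) = f(x, y) / f_Y(y), returned as a dict {x: probability}."""
    # Marginal probability of Y = y (the column total).
    f_y = sum(p for (_, yi), p in joint.items() if yi == y)
    return {x: p / f_y for (x, yi), p in joint.items() if yi == y}


means_x_given_y = {}
for y in (0, 1, 2):
    g = conditional_pmf_x_given_y(joint, y)
    # Conditional mean: sum over x of x * g(x|y).
    means_x_given_y[y] = sum(x * p for x, p in g.items())
    print(f"E[X | Y={y}] = {means_x_given_y[y]}")
```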

What is the conditional variance of \(Y\) given \(X=0\)?

Solution

We previously determined that the conditional distribution of \(Y\) given \(X\) is:

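\(\begin{array}{c|ccc|c}
h(y|x) & y=0 & y=1 & y=2 & \text{Total} \\ \hline
x=0 & 1/4 & 2/4 & 1/4 & 1 \\
x=1 & 2/4 & 1/4 & 1/4 & 1
\end{array}\)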

Therefore, we can use it, that is, \(h(y|x)\), and the formula for the conditional variance of \(Y\) given \(X=x\) to calculate the conditional variance of \(Y\) given \(X=0\). It is:

\begin{align} \sigma^2_{Y|0} &= E\{[Y-\mu_{Y|0}]^2|0\}=E\{[Y-1]^2|0\}=\sum\limits_y (y-1)^2 h(y|0)\\ &= (0-1)^2 \left(\dfrac{1}{4}\right)+(1-1)^2 \left(\dfrac{2}{4}\right)+(2-1)^2 \left(\dfrac{1}{4}\right)=\dfrac{1}{4}+0+\dfrac{1}{4}=\dfrac{2}{4} \end{align}

Alternatively, we could have used the shortcut formula. If we do, we had better get the same answer:

\begin{align} \sigma^2_{Y|0} &= E[Y^2|0]-\mu^2_{Y|0}=\left[\sum\limits_y y^2 h(y|0)\right]-1^2\\ &= \left[(0)^2\left(\dfrac{1}{4}\right)+(1)^2\left(\dfrac{2}{4}\right)+(2)^2\left(\dfrac{1}{4}\right)\right]-1\\ &= \left[0+\dfrac{2}{4}+\dfrac{4}{4}\right]-1=\dfrac{2}{4} \end{align}

And we do! That is, no matter how we choose to calculate it, we get that the variance of \(Y\) is \(\frac{1}{2}\) for the \(X=0\) sub-population.
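The agreement between the two formulas is easy to confirm in code. The sketch below is illustrative only and assumes the conditional pmf \(h(y|0)\) is entered by hand as a dictionary. The variable names are hypothetical.

```python
from fractions import Fraction as F

# Conditional pmf h(y|0) of Y given X = 0, keyed by y.
h_given_0 = {0: F(1, 4), 1: F(2, 4), 2: F(1, 4)}

# Conditional mean of Y given X = 0.
mu = sum(y * p for y, p in h_given_0.items())

# Definition: expected squared deviation from the conditional mean.
var_definition = sum((y - mu) ** 2 * p for y, p in h_given_0.items())

# Shortcut: conditional second moment minus squared conditional mean.
second_moment = sum(y ** 2 * p for y, p in h_given_0.items())
var_shortcut = second_moment - mu ** 2

print(mu)              # 1
print(var_definition)  # 1/2
print(var_shortcut)    # 1/2
```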