1.3 - Unbiased Estimation

On the previous page, we showed that if \(X_i\) are Bernoulli random variables with parameter \(p\), then:

\(\hat{p}=\dfrac{1}{n}\sum\limits_{i=1}^n X_i\)

is the maximum likelihood estimator of \(p\). And, if \(X_i\) are normally distributed random variables with mean \(\mu\) and variance \(\sigma^2\), then:

\(\hat{\mu}=\dfrac{\sum X_i}{n}=\bar{X}\) and \(\hat{\sigma}^2=\dfrac{\sum(X_i-\bar{X})^2}{n}\)

are the maximum likelihood estimators of \(\mu\) and \(\sigma^2\), respectively. A natural question then is whether or not these estimators are "good" in any sense. One measure of "good" is "unbiasedness."

Biased and Unbiased Estimators

If the following holds:

\(E[u(X_1,X_2,\ldots,X_n)]=\theta\)

then the statistic \(u(X_1,X_2,\ldots,X_n)\) is an unbiased estimator of the parameter \(\theta\). Otherwise, \(u(X_1,X_2,\ldots,X_n)\) is a biased estimator of \(\theta\).
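Although the examples below verify unbiasedness analytically, the defining expectation can also be approximated by simulation: draw many samples, compute the statistic on each, and average the results. Here is a minimal sketch in Python with NumPy (the helper name and the uniform-distribution example are our own illustration, not part of the lesson):

    import numpy as np

    def estimate_expectation(statistic, sampler, reps=100_000, seed=0):
        """Approximate E[u(X_1, ..., X_n)] by averaging the statistic
        over many simulated samples drawn by `sampler`."""
        rng = np.random.default_rng(seed)
        return np.mean([statistic(sampler(rng)) for _ in range(reps)])

    # Example: the sample mean of n = 10 Uniform(0, 1) observations.
    # Its long-run average is close to 0.5, the parameter it estimates,
    # which is exactly what unbiasedness asserts.
    print(estimate_expectation(np.mean, lambda rng: rng.uniform(0, 1, size=10)))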

Example 1-4


If \(X_i\) is a Bernoulli random variable with parameter \(p\), then:

\(\hat{p}=\dfrac{1}{n}\sum\limits_{i=1}^nX_i\)

is the maximum likelihood estimator (MLE) of \(p\). Is the MLE of \(p\) an unbiased estimator of \(p\)?

Answer

Recall that if \(X_i\) is a Bernoulli random variable with parameter \(p\), then \(E(X_i)=p\). Therefore:

\(E(\hat{p})=E\left(\dfrac{1}{n}\sum\limits_{i=1}^nX_i\right)=\dfrac{1}{n}\sum\limits_{i=1}^nE(X_i)=\dfrac{1}{n}\sum\limits_{i=1}^np=\dfrac{1}{n}(np)=p\)

The first equality holds because we've merely replaced \(\hat{p}\) with its definition. The second equality holds by the rules of expectation for a linear combination. The third equality holds because \(E(X_i)=p\). The fourth equality holds because when you add the value \(p\) up \(n\) times, you get \(np\). And, of course, the last equality is simple algebra.

In summary, we have shown that:

\(E(\hat{p})=p\)

Therefore, the maximum likelihood estimator is an unbiased estimator of \(p\).
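A quick Monte Carlo check agrees with this result. The sketch below (Python with NumPy; the particular values of \(p\), \(n\), and the number of replications are arbitrary choices) averages \(\hat{p}\) over many simulated samples and lands very close to the true \(p\):

    import numpy as np

    rng = np.random.default_rng(1)
    p, n, reps = 0.3, 25, 200_000

    # Each row is one sample of n Bernoulli(p) observations.
    samples = rng.binomial(1, p, size=(reps, n))
    p_hats = samples.mean(axis=1)      # the MLE for each sample

    print(p_hats.mean())               # close to 0.3, consistent with E(p-hat) = p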

Example 1-5


If \(X_i\) are normally distributed random variables with mean \(\mu\) and variance \(\sigma^2\), then:

\(\hat{\mu}=\dfrac{\sum X_i}{n}=\bar{X}\) and \(\hat{\sigma}^2=\dfrac{\sum(X_i-\bar{X})^2}{n}\)

are the maximum likelihood estimators of \(\mu\) and \(\sigma^2\), respectively. Are the MLEs unbiased for their respective parameters?

Answer

Recall that if \(X_i\) is a normally distributed random variable with mean \(\mu\) and variance \(\sigma^2\), then \(E(X_i)=\mu\) and \(\text{Var}(X_i)=\sigma^2\). Therefore:

\(E(\bar{X})=E\left(\dfrac{1}{n}\sum\limits_{i=1}^nX_i\right)=\dfrac{1}{n}\sum\limits_{i=1}^nE(X_i)=\dfrac{1}{n}\sum\limits_{i=1}^n\mu=\dfrac{1}{n}(n\mu)=\mu\)

The first equality holds because we've merely replaced \(\bar{X}\) with its definition. Again, the second equality holds by the rules of expectation for a linear combination. The third equality holds because \(E(X_i)=\mu\). The fourth equality holds because when you add the value \(\mu\) up \(n\) times, you get \(n\mu\). And, of course, the last equality is simple algebra.

In summary, we have shown that:

\(E(\bar{X})=\mu\)

Therefore, the maximum likelihood estimator of \(\mu\) is unbiased. Now, let's check the maximum likelihood estimator of \(\sigma^2\). First, note that we can rewrite the formula for the MLE as:

\(\hat{\sigma}^2=\left(\dfrac{1}{n}\sum\limits_{i=1}^nX_i^2\right)-\bar{X}^2\)

because:

\(\begin{aligned}
\hat{\sigma}^{2}=\frac{1}{n}\sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2} &=\frac{1}{n}\sum_{i=1}^{n}\left(X_{i}^{2}-2X_{i}\bar{X}+\bar{X}^{2}\right) \\
&=\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}-2\bar{X}\cdot\underbrace{\frac{1}{n}\sum_{i=1}^{n}X_{i}}_{\bar{X}}+\frac{1}{n}\left(n\bar{X}^{2}\right) \\
&=\frac{1}{n}\sum_{i=1}^{n}X_{i}^{2}-\bar{X}^{2}
\end{aligned}\)
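The identity is easy to confirm numerically. Here is a small check in Python with NumPy (the data values are made up purely for illustration):

    import numpy as np

    x = np.array([2.0, 5.0, 7.0, 1.0, 4.0])    # arbitrary illustrative data
    n = len(x)

    lhs = np.sum((x - x.mean()) ** 2) / n      # (1/n) * sum of squared deviations
    rhs = np.sum(x ** 2) / n - x.mean() ** 2   # mean of squares minus squared mean

    print(lhs, rhs)                            # both print 4.56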

Then, taking the expectation of the MLE, we get:

\(E(\hat{\sigma}^2)=\dfrac{(n-1)\sigma^2}{n}\)

as illustrated here:

\begin{align} E(\hat{\sigma}^2) &= E\left[\dfrac{1}{n}\sum\limits_{i=1}^nX_i^2-\bar{X}^2\right]=\left[\dfrac{1}{n}\sum\limits_{i=1}^nE(X_i^2)\right]-E(\bar{X}^2)\\
&= \dfrac{1}{n}\sum\limits_{i=1}^n(\sigma^2+\mu^2)-\left(\dfrac{\sigma^2}{n}+\mu^2\right)\\
&= \dfrac{1}{n}(n\sigma^2+n\mu^2)-\dfrac{\sigma^2}{n}-\mu^2\\
&= \sigma^2-\dfrac{\sigma^2}{n}=\dfrac{n\sigma^2-\sigma^2}{n}=\dfrac{(n-1)\sigma^2}{n}
\end{align}

The first equality holds from the rewritten form of the MLE. The second equality holds from the properties of expectation. The third equality holds from manipulating the alternative formulas for the variance, namely:

\(\text{Var}(X)=\sigma^2=E(X^2)-\mu^2\) and \(\text{Var}(\bar{X})=\dfrac{\sigma^2}{n}=E(\bar{X}^2)-\mu^2\)

The remaining equalities hold from simple algebraic manipulation. Now, because we have shown:

\(E(\hat{\sigma}^2) \neq \sigma^2\)

the maximum likelihood estimator of \(\sigma^2\) is a biased estimator.
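A simulation makes the bias visible. In the sketch below (Python with NumPy; the values of \(\mu\), \(\sigma^2\), and \(n\) are arbitrary), with \(n=5\) the average of \(\hat{\sigma}^2\) over many samples settles near \(\frac{4}{5}\sigma^2\) rather than \(\sigma^2\):

    import numpy as np

    rng = np.random.default_rng(2)
    mu, sigma2, n, reps = 10.0, 9.0, 5, 200_000

    samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
    mle_var = samples.var(axis=1, ddof=0)   # the MLE: divide by n

    print(mle_var.mean())                   # close to 7.2
    print((n - 1) / n * sigma2)             # the theoretical value (n-1)/n * sigma^2 = 7.2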

Example 1-6


If \(X_i\) are normally distributed random variables with mean \(\mu\) and variance \(\sigma^2\), what is an unbiased estimator of \(\sigma^2\)? Is the sample variance \(S^2=\dfrac{\sum(X_i-\bar{X})^2}{n-1}\) unbiased?

Answer

Recall that if \(X_i\) is a normally distributed random variable with mean \(\mu\) and variance \(\sigma^2\), then:

\(\dfrac{(n-1)S^2}{\sigma^2}\sim \chi^2_{n-1}\)

Also, recall that the expected value of a chi-square random variable is its degrees of freedom. That is, if:

\(X \sim \chi^2_{(r)}\)

then \(E(X)=r\). Therefore:

\(E(S^2)=E\left[\dfrac{\sigma^2}{n-1}\cdot \dfrac{(n-1)S^2}{\sigma^2}\right]=\dfrac{\sigma^2}{n-1} E\left[\dfrac{(n-1)S^2}{\sigma^2}\right]=\dfrac{\sigma^2}{n-1}\cdot (n-1)=\sigma^2\)

The first equality holds because we effectively multiplied the sample variance by 1. The second equality holds by the law of expectation that tells us we can pull a constant through the expectation. The third equality holds because of the two facts we recalled above. That is:

\(E\left[\dfrac{(n-1)S^2}{\sigma^2}\right]=n-1\)

And, the last equality is again simple algebra.

In summary, we have shown that, if \(X_i\) is a normally distributed random variable with mean \(\mu\) and variance \(\sigma^2\), then \(S^2\) is an unbiased estimator of \(\sigma^2\). It turns out, however, that \(S^2\) is an unbiased estimator of \(\sigma^2\) for any model, not just the normal model. (You'll be asked to show this in the homework.) And, although \(S^2\) is always an unbiased estimator of \(\sigma^2\), \(S\) is not an unbiased estimator of \(\sigma\). (You'll be asked to show this in the homework, too.)
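Both closing remarks can be illustrated with the same kind of simulation (a sketch in Python with NumPy; the parameter values are arbitrary): dividing by \(n-1\) removes the bias in estimating \(\sigma^2\), while the average of \(S\) still falls noticeably below \(\sigma\).

    import numpy as np

    rng = np.random.default_rng(3)
    mu, sigma, n, reps = 10.0, 3.0, 5, 200_000

    samples = rng.normal(mu, sigma, size=(reps, n))
    s2 = samples.var(axis=1, ddof=1)   # sample variance S^2: divide by n - 1
    s = np.sqrt(s2)                    # sample standard deviation S

    print(s2.mean())   # close to sigma^2 = 9, consistent with E(S^2) = sigma^2
    print(s.mean())    # noticeably below sigma = 3, so S is biased for sigma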

Sometimes it is impossible to find maximum likelihood estimators in a convenient closed form. Instead, numerical methods must be used to maximize the likelihood function. In such cases, we might consider using an alternative method of finding estimators, such as the "method of moments." Let's go take a look at that method now.