Lesson 24: Sufficient Statistics

Overview

In the lesson on Point Estimation, we derived estimators of various parameters using two methods, namely, the method of maximum likelihood and the method of moments. The estimators resulting from these two methods are typically intuitive estimators. It makes sense, for example, that we would want to use the sample mean \(\bar{X}\) and sample variance \(S^2\) to estimate the mean \(\mu\) and variance \(\sigma^2\) of a normal population.

In the process of estimating such a parameter, we summarize, or reduce, the information in a sample of size \(n\), \(X_1, X_2,\ldots, X_n\), to a single number, such as the sample mean \(\bar{X}\). The actual sample values are no longer important to us. That is, if we use a sample mean of 3 to estimate the population mean \(\mu\), it doesn't matter if the original data values were (1, 3, 5) or (2, 3, 4). Has this process of reducing the \(n\) data points to a single number retained all of the information about \(\mu\) that was contained in the original \(n\) data points? Or has some information about the parameter been lost through the process of summarizing the data? In this lesson, we'll learn how to find statistics that summarize all of the information in a sample about the desired parameter. Such statistics are called sufficient statistics, and hence the name of this lesson.
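The point about (1, 3, 5) versus (2, 3, 4) can be made concrete with a short numerical check. The sketch below is an illustration, not part of the lesson: it assumes a normal population with known standard deviation \(\sigma = 1\) and compares the two samples' log-likelihoods for \(\mu\). Because both samples have the same mean, their log-likelihoods differ only by a constant that does not involve \(\mu\), so they support exactly the same inferences about the population mean.

```python
import math

def normal_log_likelihood(data, mu, sigma=1.0):
    """Log-likelihood of an i.i.d. N(mu, sigma^2) sample."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)
        for x in data
    )

sample_a = [1, 3, 5]  # sample mean = 3
sample_b = [2, 3, 4]  # sample mean = 3

# Evaluate the gap between the two log-likelihoods over a grid of mu values.
# If the gap is the same for every mu, the samples carry identical
# information about the mean.
diffs = [
    normal_log_likelihood(sample_a, mu) - normal_log_likelihood(sample_b, mu)
    for mu in (0.0, 1.0, 2.0, 3.0, 5.0)
]
print(diffs)  # the same constant at every mu
```

The gap is constant in \(\mu\) (it comes only from the \(\sum x_i^2\) terms, which differ between the samples but do not involve \(\mu\)), which is exactly what it means for \(\bar{X}\) to have retained all of the information about \(\mu\) in these data.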


Upon completion of this lesson, you should be able to:

  • Learn a formal definition of sufficiency.
  • Learn how to apply the Factorization Theorem to identify a sufficient statistic.
  • Learn how to apply the Exponential Criterion to identify a sufficient statistic.
  • Extend the definition of sufficiency for one parameter to two (or more) parameters.