Factorization Theorem
While the definition of sufficiency provided on the previous page may make sense intuitively, it is not always all that easy to find the conditional distribution of X_{1}, X_{2}, ..., X_{n} given Y. Not to mention that we'd have to find the conditional distribution of X_{1}, X_{2}, ..., X_{n} given Y for every Y that we'd want to consider a possible sufficient statistic! Therefore, using the formal definition of sufficiency as a way of identifying a sufficient statistic for a parameter θ can often be a daunting road to follow. Thankfully, a theorem often referred to as the Factorization Theorem provides an easier alternative! We state it here without proof.
Factorization Theorem. Let X_{1}, X_{2}, ..., X_{n} denote random variables with joint probability density function or joint probability mass function f(x_{1}, x_{2}, ..., x_{n}; θ), which depends on the parameter θ. Then, the statistic \(Y = u(X_1, X_2, ... , X_n) \) is sufficient for θ if and only if the p.d.f. (or p.m.f.) can be factored into two components, that is:
\[f(x_1, x_2, ... , x_n;\theta) = \phi [ u(x_1, x_2, ... , x_n);\theta ] h(x_1, x_2, ... , x_n) \]
where:

- \(\phi\) is a function that depends on the data \(x_1, x_2, ... , x_n\) only through the value of the statistic \(u(x_1, x_2, ... , x_n)\), and
- \(h(x_1, x_2, ... , x_n)\) does not depend on the parameter θ.
Let's put the theorem to work on a few examples!
Example
Let X_{1}, X_{2}, ..., X_{n} denote a random sample from a Poisson distribution with parameter λ > 0. Find a sufficient statistic for the parameter λ.
Solution. Because X_{1}, X_{2}, ..., X_{n} is a random sample, the joint probability mass function of X_{1}, X_{2}, ..., X_{n} is, by independence:
\[f(x_1, x_2, ... , x_n;\lambda) = f(x_1;\lambda) \times f(x_2;\lambda) \times ... \times f(x_n;\lambda)\]
Inserting what we know to be the probability mass function of a Poisson random variable with parameter λ, the joint p.m.f. is therefore:
\[f(x_1, x_2, ... , x_n;\lambda) = \frac{e^{-\lambda}\lambda^{x_1}}{x_1!} \times\frac{e^{-\lambda}\lambda^{x_2}}{x_2!} \times ... \times \frac{e^{-\lambda}\lambda^{x_n}}{x_n!}\]
Now, simplifying, by adding up all n of the λ's in the exponent of e, as well as all n of the x_{i}'s in the exponent of λ, we get:
\[f(x_1, x_2, ... , x_n;\lambda) = \left(e^{-n\lambda}\lambda^{\Sigma x_i} \right) \times \left( \frac{1}{x_1! x_2! ... x_n!} \right)\]
Hey, look at that! We just factored the joint p.m.f. into two functions, one (φ) being only a function of the statistic \(Y=\sum_{i=1}^{n}X_i\) and the other (h) not depending on the parameter λ: here \(\phi\left[u(x_1, x_2, ... , x_n);\lambda\right] = e^{-n\lambda}\lambda^{\Sigma x_i}\) and \(h(x_1, x_2, ... , x_n) = \frac{1}{x_1! x_2! ... x_n!}\).
Therefore, the Factorization Theorem tells us that \(Y=\sum_{i=1}^{n}X_i\) is a sufficient statistic for λ. But, wait a second! We can also write the joint p.m.f. as:
\[f(x_1, x_2, ... , x_n;\lambda) = \left(e^{-n\lambda}\lambda^{n\bar{x}} \right) \times \left( \frac{1}{x_1! x_2! ... x_n!} \right)\]
Therefore, the Factorization Theorem tells us that \(Y = \bar{X}\) is also a sufficient statistic for λ!
If you think about it, it makes sense that \(Y = \bar{X}\) and \(Y=\sum_{i=1}^{n}X_i\) are both sufficient statistics, because if we know \(Y = \bar{X}\), we can easily find \(Y=\sum_{i=1}^{n}X_i\). And, if we know \(Y=\sum_{i=1}^{n}X_i\), we can easily find \(Y = \bar{X}\).
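We can sanity-check this factorization numerically. Here is a minimal Python sketch (the sample values and the trial value of λ are purely illustrative) that verifies the joint p.m.f. equals the product φ × h:

```python
import math

# Hypothetical Poisson sample (values chosen for illustration only)
sample = [2, 0, 3, 1, 4]
lam = 1.7          # an arbitrary trial value of the parameter lambda
n = len(sample)

# Joint p.m.f.: product of individual Poisson probabilities e^{-lam} lam^x / x!
joint = math.prod(math.exp(-lam) * lam**x / math.factorial(x) for x in sample)

# Factored form: phi depends on the data only through y = sum(x_i); h is lambda-free
y = sum(sample)
phi = math.exp(-n * lam) * lam**y
h = 1 / math.prod(math.factorial(x) for x in sample)

assert math.isclose(joint, phi * h)
```

Changing `lam` rescales φ but leaves h untouched, which is exactly what the Factorization Theorem requires of the two pieces.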
The previous example suggests that there can be more than one sufficient statistic for a parameter θ. In general, if Y is a sufficient statistic for a parameter θ, then every one-to-one function of Y not involving θ is also a sufficient statistic for θ. Let's take a look at another example.
Example
Let X_{1}, X_{2}, ..., X_{n} be a random sample from a normal distribution with mean μ and variance 1. Find a sufficient statistic for the parameter μ.
Solution. Because X_{1}, X_{2}, ..., X_{n} is a random sample, the joint probability density function of X_{1}, X_{2}, ..., X_{n} is, by independence:
\[f(x_1, x_2, ... , x_n;\mu) = f(x_1;\mu) \times f(x_2;\mu) \times ... \times f(x_n;\mu)\]
Inserting what we know to be the probability density function of a normal random variable with mean μ and variance 1, the joint p.d.f. is:
\[f(x_1, x_2, ... , x_n;\mu) = \frac{1}{(2\pi)^{1/2}} exp \left[ -\frac{1}{2}(x_1 - \mu)^2 \right] \times \frac{1}{(2\pi)^{1/2}} exp \left[ -\frac{1}{2}(x_2 - \mu)^2 \right] \times ... \times \frac{1}{(2\pi)^{1/2}} exp \left[ -\frac{1}{2}(x_n - \mu)^2 \right] \]
Collecting like terms, we get:
\[f(x_1, x_2, ... , x_n;\mu) = \frac{1}{(2\pi)^{n/2}} exp \left[ -\frac{1}{2}\sum_{i=1}^{n}(x_i - \mu)^2 \right]\]
A trick that makes factoring the joint p.d.f. an easier task is to add 0, in the form \(-\bar{x}+\bar{x}\), to the quantity in parentheses in the summation. That is:
\[f(x_1, x_2, ... , x_n;\mu) = \frac{1}{(2\pi)^{n/2}} exp \left[ -\frac{1}{2}\sum_{i=1}^{n}\left[ (x_i - \bar{x}) + (\bar{x}-\mu)\right]^2 \right]\]
Now, squaring the quantity in parentheses, we get:
\[f(x_1, x_2, ... , x_n;\mu) = \frac{1}{(2\pi)^{n/2}} exp \left[ -\frac{1}{2}\sum_{i=1}^{n}\left[ (x_i - \bar{x})^2 +2(x_i - \bar{x}) (\bar{x}-\mu)+ (\bar{x}-\mu)^2\right] \right]\]
And then distributing the summation, we get:
\[f(x_1, x_2, ... , x_n;\mu) = \frac{1}{(2\pi)^{n/2}} exp \left[ -\frac{1}{2}\sum_{i=1}^{n} (x_i - \bar{x})^2 - (\bar{x}-\mu) \sum_{i=1}^{n}(x_i - \bar{x}) - \frac{1}{2}\sum_{i=1}^{n}(\bar{x}-\mu)^2\right] \]
But, the middle term in the exponent is 0, because \(\sum_{i=1}^{n}(x_i - \bar{x}) = 0\), and the last term, because it doesn't depend on the index i, can be added up n times to give \(\frac{n}{2}(\bar{x}-\mu)^2\). So, simplifying, we get:
\[f(x_1, x_2, ... , x_n;\mu) = \left\{ exp \left[ -\frac{n}{2} (\bar{x}-\mu)^2 \right] \right\} \times \left\{ \frac{1}{(2\pi)^{n/2}} exp \left[ -\frac{1}{2}\sum_{i=1}^{n} (x_i - \bar{x})^2 \right] \right\} \]
In summary, we have factored the joint p.d.f. into two functions, one (φ) being only a function of the statistic \(Y = \bar{X}\) and the other (h) not depending on the parameter μ: here \(\phi(\bar{x};\mu) = exp \left[ -\frac{n}{2} (\bar{x}-\mu)^2 \right]\) and \(h(x_1, x_2, ... , x_n) = \frac{1}{(2\pi)^{n/2}} exp \left[ -\frac{1}{2}\sum_{i=1}^{n} (x_i - \bar{x})^2 \right]\).
Therefore, the Factorization Theorem tells us that \(Y = \bar{X}\) is a sufficient statistic for μ. Now, \(Y = \bar{X}^3\) is also sufficient for μ, because if we are given the value of \( \bar{X}^3\), we can easily get the value of \(\bar{X}\) through the one-to-one function \(w=y^{1/3}\). That is:
\[ W=(\bar{X}^3)^{1/3}=\bar{X} \]
On the other hand, \(Y = \bar{X}^2\) is not a sufficient statistic for μ, because it is not a one-to-one function. That is, if we are given the value of \(\bar{X}^2\), using the inverse function:
\[w=y^{1/2}\]
we get two possible values, namely:
\(-\bar{X}\) and \(+\bar{X}\)
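The normal factorization derived above can also be checked numerically. A minimal sketch (the sample values and the trial value of μ are purely illustrative; the variance is taken as the known value 1):

```python
import math

# Hypothetical N(mu, 1) sample (values chosen for illustration only)
sample = [4.9, 5.3, 4.6, 5.1, 5.0]
mu = 5.0           # an arbitrary trial value of the parameter mu
n = len(sample)
xbar = sum(sample) / n

# Joint p.d.f.: product of N(mu, 1) densities
joint = math.prod(
    math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi) for x in sample
)

# Factored form: phi depends on the data only through xbar; h is mu-free
phi = math.exp(-0.5 * n * (xbar - mu) ** 2)
h = math.exp(-0.5 * sum((x - xbar) ** 2 for x in sample)) / (2 * math.pi) ** (n / 2)

assert math.isclose(joint, phi * h)
```

The assertion holds for any choice of `mu`, reflecting the algebraic identity \(\sum(x_i-\mu)^2 = \sum(x_i-\bar{x})^2 + n(\bar{x}-\mu)^2\) used in the derivation.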
We're getting so good at this, let's take a look at one more example!
Example
Let X_{1}, X_{2}, ..., X_{n} be a random sample from an exponential distribution with parameter θ. Find a sufficient statistic for the parameter θ.
Solution. Because X_{1}, X_{2}, ..., X_{n} is a random sample, the joint probability density function of X_{1}, X_{2}, ..., X_{n} is, by independence:
\[f(x_1, x_2, ... , x_n;\theta) = f(x_1;\theta) \times f(x_2;\theta) \times ... \times f(x_n;\theta)\]
Inserting what we know to be the probability density function of an exponential random variable with parameter θ, the joint p.d.f. is:
\[f(x_1, x_2, ... , x_n;\theta) =\frac{1}{\theta}exp\left( -\frac{x_1}{\theta}\right) \times \frac{1}{\theta}exp\left( -\frac{x_2}{\theta}\right) \times ... \times \frac{1}{\theta}exp\left( -\frac{x_n}{\theta} \right) \]
Now, simplifying, by multiplying the n factors of 1/θ together and adding up the n x_{i}'s in the exponents, we get:
\[f(x_1, x_2, ... , x_n;\theta) =\frac{1}{\theta^n}exp\left( -\frac{1}{\theta} \sum_{i=1}^{n} x_i\right) \]
We have again factored the joint p.d.f. into two functions, one (φ) being only a function of the statistic \(Y=\sum_{i=1}^{n}X_i\) and the other (h) not depending on the parameter θ: here \(\phi\left[\Sigma x_i;\theta\right] = \frac{1}{\theta^n}exp\left( -\frac{1}{\theta} \sum_{i=1}^{n} x_i\right)\) and \(h(x_1, x_2, ... , x_n) = 1\).
Therefore, the Factorization Theorem tells us that \(Y=\sum_{i=1}^{n}X_i\) is a sufficient statistic for θ. And, since \(Y = \bar{X}\) is a one-to-one function of \(Y=\sum_{i=1}^{n}X_i\), it implies that \(Y = \bar{X}\) is also a sufficient statistic for θ.
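One more numerical sanity check, this time for the exponential case. In this factorization the h function is simply the constant 1, so the joint p.d.f. should equal φ alone (the sample values and the trial value of θ are purely illustrative):

```python
import math

# Hypothetical exponential sample (values chosen for illustration only)
sample = [0.8, 2.1, 0.3, 1.4]
theta = 1.2        # an arbitrary trial value of the parameter theta
n = len(sample)

# Joint p.d.f.: product of exponential densities (1/theta) e^{-x/theta}
joint = math.prod(math.exp(-x / theta) / theta for x in sample)

# Factored form: phi depends on the data only through y = sum(x_i); h(x) = 1
y = sum(sample)
phi = math.exp(-y / theta) / theta**n

assert math.isclose(joint, phi)
```

Because h is constant, every bit of the likelihood's dependence on the data flows through \(\sum x_i\), which is why that sum (or equivalently \(\bar{X}\)) carries all the information about θ.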