7.3 - Estimator for Cluster Sampling when Primary Units are Selected by p.p.s.

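The primary units are selected with replacement, with probabilities proportional to size. With \(M_i\) the number of secondary units in primary unit \(i\) and \(M\) the total number of secondary units in the population, the selection probability of primary unit \(i\) is: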

\(p_i=M_i/M\)
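As a rough illustration of this selection scheme, the sketch below draws primary units with replacement using probabilities \(p_i=M_i/M\). The function name `pps_sample_with_replacement` and the cluster sizes are hypothetical and chosen only for illustration; they are not data from this lesson.

```python
import random


def pps_sample_with_replacement(sizes, n, seed=None):
    """Draw n primary-unit indices with replacement, with p_i = M_i / M.

    sizes: list of cluster sizes M_i for every primary unit in the population.
    Returns the drawn indices (0-based) and the selection probabilities.
    """
    rng = random.Random(seed)
    total_size = sum(sizes)                      # M
    probs = [m / total_size for m in sizes]      # p_i = M_i / M
    # random.choices samples with replacement according to the weights.
    indices = rng.choices(range(len(sizes)), weights=probs, k=n)
    return indices, probs


# Hypothetical cluster sizes, for illustration only.
example_sizes = [650, 2840, 3200, 1200, 900]
drawn, probs = pps_sample_with_replacement(example_sizes, n=3, seed=1)
print("drawn indices:", drawn)
print("selection probabilities:", [round(p, 3) for p in probs])
```

Because the draws are with replacement, the same primary unit can appear more than once in `drawn`; in that case it contributes to the estimator once per draw.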

The Hansen-Hurwitz (p.p.s.) estimator of the population total \(\tau\), with \(y_i\) the total for the \(i\)th sampled primary unit, is:

\(\hat{\tau}_p=\dfrac{M}{n}\sum\limits_{i=1}^n \left(\dfrac{y_i}{M_i}\right)\)

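Let \(\bar{y}_i=\dfrac{y_i}{M_i}\) denote the mean per secondary unit in the \(i\)th sampled primary unit. An unbiased estimator of the variance of \(\hat{\tau}_p\) is:

\(\hat{V}ar(\hat{\tau}_p)=\dfrac{M^2}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\)

where \(\hat{\mu}_p=\dfrac{\hat{\tau}_p}{M}=\dfrac{1}{n}\sum\limits_{i=1}^n \bar{y}_i\) is unbiased for \(\mu\), the population mean per secondary unit.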

Since \(\hat{\mu}_p=\hat{\tau}_p/M\), dividing the variance estimator of \(\hat{\tau}_p\) by \(M^2\) gives:

\(\hat{V}ar(\hat{\mu}_p)=\dfrac{1}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\)
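The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `hansen_hurwitz_pps` and the toy numbers in the usage lines are assumptions made for this sketch. The total and its variance are returned only when the population size \(M\) is supplied, since \(\hat{\mu}_p\) itself does not require \(M\).

```python
def hansen_hurwitz_pps(y, sizes, M=None):
    """Hansen-Hurwitz estimates for cluster sampling with p.p.s. selection.

    y:     cluster totals y_i for the n sampled primary units
    sizes: cluster sizes M_i for the same units
    M:     total number of secondary units in the population (optional)
    """
    n = len(y)
    if n < 2 or len(sizes) != n:
        raise ValueError("need at least two sampled units and matching sizes")

    cluster_means = [yi / mi for yi, mi in zip(y, sizes)]   # ybar_i = y_i / M_i
    mu_hat = sum(cluster_means) / n                          # (1/n) * sum ybar_i
    var_mu_hat = sum((yb - mu_hat) ** 2 for yb in cluster_means) / (n * (n - 1))

    result = {"mu_hat": mu_hat, "var_mu_hat": var_mu_hat}
    if M is not None:
        result["tau_hat"] = M * mu_hat                       # tau_hat = M * mu_hat
        result["var_tau_hat"] = M ** 2 * var_mu_hat          # scales by M^2
    return result


# Toy numbers, for illustration only (not data from this lesson).
toy = hansen_hurwitz_pps(y=[10, 30], sizes=[20, 40], M=100)
print(toy)
```

With the toy input, the two cluster means are 0.5 and 0.75, so the estimated mean is their simple average and the estimated total is that average multiplied by \(M\).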

Example: Estimating the population mean per secondary unit when primary units are selected by p.p.s.

From the "Total number of computer help requests" example in Lesson 3.1, 3 clusters out of 10 clusters are sampled (n = 3) with replacement. The data are:

\(y_1=420, y_2 = 1785, y_3=2198\)

\(M_1=650, M_2=2840, M_3=3200\)

Try it!

Find the Hansen-Hurwitz estimate of the population mean and the estimated variance of the estimator.

\begin{align}
\hat{\mu}_p &= \dfrac{1}{n} \sum\limits_{i=1}^n \dfrac{y_i}{M_i}\\
&= \dfrac{1}{3}\times \left(\dfrac{420}{650}+\dfrac{1785}{2840}+\dfrac{2198}{3200}\right)\\
&= 0.6538
\end{align}

\begin{align}
\hat{V}ar(\hat{\mu}_p)&=\dfrac{1}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\\
&= \dfrac{1}{3 \times 2}\left[(0.6462-0.6538)^2+(0.6285-0.6538)^2+(0.6869-0.6538)^2\right]\\
&= 0.000299
\end{align}
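The arithmetic in this example can be checked with a short script. The sketch below uses only the three sampled values given above; the variable names are chosen for this illustration.

```python
# Sampled cluster totals y_i and cluster sizes M_i from the example (n = 3).
y = [420, 1785, 2198]
sizes = [650, 2840, 3200]
n = len(y)

# Mean per secondary unit in each sampled cluster: ybar_i = y_i / M_i
cluster_means = [yi / mi for yi, mi in zip(y, sizes)]

# Hansen-Hurwitz estimate of the population mean: average of the ybar_i
mu_hat = sum(cluster_means) / n

# Estimated variance of mu_hat
var_mu_hat = sum((yb - mu_hat) ** 2 for yb in cluster_means) / (n * (n - 1))

print([round(yb, 4) for yb in cluster_means])  # [0.6462, 0.6285, 0.6869]
print(round(mu_hat, 4))                        # 0.6538
print(round(var_mu_hat, 6))                    # 0.000299
```

Estimating the population total in the same way would additionally require \(M\), the total number of secondary units in the population, which is not listed in this example.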

Note! For a worked example of estimating the population total, refer to the earlier lecture notes on the Hansen-Hurwitz estimator and sampling with probabilities proportional to size, in particular the Palm Tree total estimator examples.