# 7.3 - Estimator for Cluster Sampling when Primary units are selected by p.p.s

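The primary units are selected with replacement, with probabilities proportional to size. If $$M_i$$ is the number of secondary units in primary unit $$i$$ and $$M$$ is the total number of secondary units in the population, the selection probability for primary unit $$i$$ is: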

$$p_i=M_i/M$$

The Hansen-Hurwitz (p.p.s.) estimator of the population total, where $$y_i$$ is the total for the $$i$$th sampled primary unit, is:

$$\hat{\tau}_p=\dfrac{M}{n}\sum\limits_{i=1}^n \left(\dfrac{y_i}{M_i}\right)$$
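As a brief sketch of where this form comes from, assume the general Hansen-Hurwitz estimator $$\hat{\tau}_p=\dfrac{1}{n}\sum_{i=1}^n y_i/p_i$$ from the earlier lesson, with $$y_i$$ taken to be the total of the $$i$$th sampled primary unit. Substituting $$p_i=M_i/M$$ gives:

$$\hat{\tau}_p=\dfrac{1}{n}\sum\limits_{i=1}^n \dfrac{y_i}{p_i}=\dfrac{1}{n}\sum\limits_{i=1}^n \dfrac{y_i}{M_i/M}=\dfrac{M}{n}\sum\limits_{i=1}^n \left(\dfrac{y_i}{M_i}\right)$$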

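Let $$\bar{y}_i=\dfrac{y_i}{M_i}$$ denote the mean per secondary unit in the $$i$$th sampled primary unit. The estimated variance of $$\hat{\tau}_p$$ is:

$$\hat{V}ar(\hat{\tau}_p)=\dfrac{M^2}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2$$

where $$\hat{\mu}_p=\dfrac{\hat{\tau}_p}{M}$$ is unbiased for the population mean per secondary unit, $$\mu$$.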

Because $$\hat{\mu}_p=\hat{\tau}_p/M$$, dividing the variance estimator above by $$M^2$$ gives:

$$\hat{V}ar(\hat{\mu}_p)=\dfrac{1}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2$$

Example: Estimating the population mean per secondary unit when primary units are selected by p.p.s.

From the "Total number of computer help requests" example in Lesson 3.1, 3 clusters out of 10 clusters are sampled (n = 3) with replacement. The data are:

$$y_1=420, y_2 = 1785, y_3=2198$$

$$M_1=650, M_2=2840, M_3=3200$$

#### Try it!

Find the Hansen-Hurwitz estimator for the population mean and also find the variance of the estimator.

\begin{align}
\hat{\mu}_p &= \dfrac{1}{n} \sum\limits_{i=1}^n \dfrac{y_i}{M_i}\\
&= \dfrac{1}{3}\times \left(\dfrac{420}{650}+\dfrac{1785}{2840}+\dfrac{2198}{3200}\right)\\
&= 0.6538
\end{align}

\begin{align}
\hat{V}ar(\hat{\mu}_p)&=\dfrac{1}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\\
&= \dfrac{1}{3 \times 2}[(0.6462-0.6538)^2+(0.6285-0.6538)^2+(0.6869-0.6538)^2]\\
&= 0.000299
\end{align}
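The arithmetic above can be checked with a short script. This is a minimal sketch rather than part of the original lesson: the function name `hansen_hurwitz_mean` and the variable names are illustrative assumptions, and the code assumes each `y[i]` is the total for sampled cluster `i` and each `sizes[i]` is its size $$M_i$$.

```python
def hansen_hurwitz_mean(y, sizes):
    """Hansen-Hurwitz estimate of the mean per secondary unit under p.p.s.
    sampling with replacement, plus its estimated variance.

    y     -- totals y_i of the sampled primary units (assumed meaning)
    sizes -- sizes M_i of the sampled primary units
    """
    n = len(y)
    if n < 2 or n != len(sizes):
        raise ValueError("need at least two sampled units with matching sizes")

    # Per-unit means: y_bar_i = y_i / M_i
    y_bar = [yi / mi for yi, mi in zip(y, sizes)]

    # mu_hat_p = (1/n) * sum of y_bar_i
    mu_hat = sum(y_bar) / n

    # Var_hat(mu_hat_p) = 1 / (n (n - 1)) * sum of (y_bar_i - mu_hat_p)^2
    var_hat = sum((yb - mu_hat) ** 2 for yb in y_bar) / (n * (n - 1))
    return mu_hat, var_hat


# Data from the worked example
y = [420, 1785, 2198]
sizes = [650, 2840, 3200]

mu_hat, var_hat = hansen_hurwitz_mean(y, sizes)
print(round(mu_hat, 4), round(var_hat, 6))
```

Rounded to the same precision as the hand calculation, the script should agree with the values 0.6538 and 0.000299 obtained above.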

Note! For an example of estimating the population total, refer to the earlier lecture notes on the Hansen-Hurwitz estimator and sampling with probabilities proportional to size, in particular the Palm Tree total estimator examples.
