# 7.3 - Estimator for Cluster Sampling when Primary Units are Selected by p.p.s.

The primary units are selected with replacement, with probabilities proportional to size, where \(M_i\) is the number of secondary units in primary unit \(i\) and \(M\) is the total number of secondary units in the population:

*\(p_i=M_i/M\)*

The Hansen-Hurwitz (p.p.s.) estimator is:

\(\hat{\tau}_p=\dfrac{M}{n}\sum\limits_{i=1}^n \left(\dfrac{y_i}{M_i}\right)\)
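As a quick check, this form can be recovered from the general Hansen-Hurwitz estimator, which averages \(y_i/p_i\) over the \(n\) draws. Substituting \(p_i=M_i/M\) gives:

\begin{align}
\hat{\tau}_p &= \dfrac{1}{n}\sum\limits_{i=1}^n \dfrac{y_i}{p_i}\\
&= \dfrac{1}{n}\sum\limits_{i=1}^n \dfrac{y_i}{M_i/M}\\
&= \dfrac{M}{n}\sum\limits_{i=1}^n \left(\dfrac{y_i}{M_i}\right)
\end{align}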

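Denote the mean per secondary unit in the \(i\)th sampled primary unit by \(\bar{y}_i=\dfrac{y_i}{M_i}\), where \(y_i\) is the total for that primary unit. The estimated variance is: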

\(\hat{V}ar(\hat{\tau}_p)=\dfrac{M^2}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\) where

\(\hat{\mu}_p=\dfrac{\hat{\tau}_p}{M}\) is unbiased for \(\mu\).

Thus we also see that:

\(\hat{V}ar(\hat{\mu}_p)=\dfrac{1}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\)
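A short sketch of why these statements hold, assuming the draws are made with replacement. Here \(N\) denotes the number of primary units in the population and \(\tau\) the population total; \(N\) does not appear in the formulas above and is introduced only for this derivation. For a single draw:

\begin{align}
E(\bar{y}_i) &= \sum\limits_{j=1}^N p_j \dfrac{y_j}{M_j}\\
&= \sum\limits_{j=1}^N \dfrac{M_j}{M}\cdot\dfrac{y_j}{M_j}\\
&= \dfrac{1}{M}\sum\limits_{j=1}^N y_j = \dfrac{\tau}{M} = \mu
\end{align}

Since \(\hat{\mu}_p\) is the average of \(n\) such draws, \(E(\hat{\mu}_p)=\mu\). Because \(\hat{\mu}_p=\hat{\tau}_p/M\) with \(M\) a known constant, the variance estimator scales by \(1/M^2\):

\(\hat{V}ar(\hat{\mu}_p)=\dfrac{1}{M^2}\hat{V}ar(\hat{\tau}_p)\)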

**Example**: Estimating the population mean per secondary unit when primary units are selected by p.p.s.

From the "Total number of computer help requests" example in Lesson 3.1, 3 clusters out of 10 clusters are sampled (*n* = 3) with replacement. The data are:

\(y_1=420, y_2 = 1785, y_3=2198\)

\(M_1=650, M_2=2840, M_3=3200\)

#### Try it!

\begin{align}
\hat{\mu}_p &= \dfrac{1}{n} \sum\limits_{i=1}^n \dfrac{y_i}{M_i}\\
&= \dfrac{1}{3}\times \left(\dfrac{420}{650}+\dfrac{1785}{2840}+\dfrac{2198}{3200}\right)\\
&= 0.6538
\end{align}

\begin{align}
\hat{V}ar(\hat{\mu}_p)&=\dfrac{1}{n(n-1)}\sum\limits_{i=1}^n (\bar{y}_i-\hat{\mu}_p)^2\\
&= \dfrac{1}{3 \times 2}[(0.6462-0.6538)^2+(0.6285-0.6538)^2+(0.6869-0.6538)^2]\\
&= 0.000299
\end{align}
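The hand calculation can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original lesson; the function name `pps_cluster_mean` and its argument names are hypothetical, and it assumes the primary units were drawn with replacement with probabilities \(p_i=M_i/M\).

```python
def pps_cluster_mean(y, sizes):
    """Estimate the mean per secondary unit and its estimated variance.

    y     : totals y_i for the sampled primary units
    sizes : sizes M_i of the same primary units
    Assumes p.p.s. sampling with replacement (hypothetical helper).
    """
    n = len(y)
    if n != len(sizes) or n < 2:
        raise ValueError("need at least two units and matching lengths")

    # Per-unit means: y_bar_i = y_i / M_i
    y_bar = [yi / mi for yi, mi in zip(y, sizes)]

    # mu_hat_p is the plain average of the per-unit means
    mu_hat = sum(y_bar) / n

    # Estimated variance: sum of squared deviations divided by n(n-1)
    var_hat = sum((yb - mu_hat) ** 2 for yb in y_bar) / (n * (n - 1))
    return mu_hat, var_hat


# Data from the computer help requests example
y = [420, 1785, 2198]
sizes = [650, 2840, 3200]

mu_hat, var_hat = pps_cluster_mean(y, sizes)
print(f"mu_hat  = {mu_hat:.4f}")   # 0.6538
print(f"var_hat = {var_hat:.6f}")  # 0.000299
```

Note that the population total \(M\) is not needed for the mean per secondary unit; it would only be required to scale up to \(\hat{\tau}_p=M\hat{\mu}_p\).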

**Note!** For an example of estimating the population total, refer to the earlier lecture notes on the Hansen-Hurwitz estimator with probabilities proportional to size, as used in the Palm Tree total estimator examples.