1.3 - Estimating Population Mean and Total under SRS

Simple Random Sampling
Random sampling without replacement such that every possible sample of n units has the same probability of selection.

Example 1-2: Patient Records

A hospital has 1,125 patient records. How can one randomly select 120 records to review?

Answer

Assign a number from 1 to 1,125 to each record and randomly select 120 numbers from 1 to 1,125 without replacement.

In Minitab, use the following commands:
  1. Calc > Make Patterned Data > Simple Set of Numbers
  2. Calc > Random Data > Sample From Columns... (sampling without replacement is the default)
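For readers without Minitab, the same selection can be sketched in Python. This is only an illustration, not part of the original instructions: the function name `draw_srs` and the seed value are assumptions made here, while `random.sample` is the standard-library routine that draws without replacement.

```python
import random

N_RECORDS = 1125   # number of patient records in the example
SAMPLE_SIZE = 120  # number of records to review


def draw_srs(population_size, sample_size, seed=None):
    """Return a sorted simple random sample of record numbers 1..population_size."""
    rng = random.Random(seed)  # the seed is optional; it only makes the draw repeatable
    # random.sample draws without replacement, so no record number can repeat
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


selected = draw_srs(N_RECORDS, SAMPLE_SIZE, seed=2024)
print(selected[:10])  # first few selected record numbers
```

Each run with a different seed (or with no seed) gives a different set of 120 record numbers, and every possible set of 120 is equally likely.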

Example 1-3: Total Number of Beetles

To estimate the total number of beetles in an agricultural field, subdivide the field into 100 equally sized units, take a simple random sample of eight units, and count the number of beetles in these eight units.

1 11 21 31 41 51 61 71 81 91
2 12 22 32 42 52 62 72 82 92
3 13 23 33 43 53 63 73 83 93
4 14 24 34 44 54 64 74 84 94
5 15 25 35 45 55 65 75 85 95
6 16 26 36 46 56 66 76 86 96
7 17 27 37 47 57 67 77 87 97
8 18 28 38 48 58 68 78 88 98
9 19 29 39 49 59 69 79 89 99
10 20 30 40 50 60 70 80 90 100

Field divided into 100 equally sized units

Row   C1   C2
  1    1   46
  2    2  100
  3    3   51
  4    4   15
  5    5   30
  6    6   91
  7    7   94
  8    8   73
  9    9
 10   10
 11   11
 12   12
 13   13

C1 contains the numbers 1 to 100 (only the first 13 rows are shown) and C2 contains the 8 random numbers generated by Minitab.

  1   11   21   31   41  [51]  61   71   81  [91]
  2   12   22   32   42   52   62   72   82   92
  3   13   23   33   43   53   63  [73]  83   93
  4   14   24   34   44   54   64   74   84  [94]
  5  [15]  25   35   45   55   65   75   85   95
  6   16   26   36  [46]  56   66   76   86   96
  7   17   27   37   47   57   67   77   87   97
  8   18   28   38   48   58   68   78   88   98
  9   19   29   39   49   59   69   79   89   99
 10   20  [30]  40   50   60   70   80   90 [100]

The eight sampled units (the numbers in C2) are shown in brackets. N = 100 is the population size and n = 8 is the sample size.

Notation

Let \(y_i\) denote the number of beetles in the \(i\)th unit, and let \(N\) denote the number of units in the population.

Variable of interest: \(y_1, \ldots, y_N\)

\(\mu=\dfrac{y_1+y_2+\ldots +y_N}{N}\) (the population mean)

\(\tau=y_1+y_2+\ldots +y_N=N \times \mu\) (the population total)

\(\mbox{sample mean}=\bar{y}=\hat{\mu}=\dfrac{y_1+y_2+\ldots +y_n}{n}\)

\(\mbox{estimate for population total}=\hat{\tau}=N \times \bar{y}\) (expansion estimator)

 

Finite population variance:

\(\sigma^2=\sum\limits^N_{i=1} \dfrac{(y_i-\mu)^2}{N-1}\)

\(\sigma^2\) can be estimated by the sample variance \(s^2\):

\(s^2=\sum\limits^n_{i=1}\dfrac{(y_i-\bar{y})^2}{n-1}\)

Sample standard deviation: \(s=\sqrt{s^2}\)

Example 1-3b: Total Number of Beetles

For the beetle example, the counts observed in the eight sampled units are: 234, 256, 128, 245, 211, 240, 202, 267

  • \(\bar{y}=222.875\)
  • \(s^2=1932.696\) (the calculations below use 1932.657, which comes from squaring the rounded value \(s=43.962\))
  • \(s=43.962\)

Estimate for population total is:

\begin{eqnarray}
  \hat{\tau} &=& N \times \bar{y} \nonumber\\
         &=& 100 \times 222.875 \nonumber\\
         &=& 22287.5  \nonumber
\end{eqnarray}
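The summary statistics and the expansion estimate can be reproduced with a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the variable names are chosen here for illustration, and `statistics.variance` uses the \(n-1\) divisor, matching the definition of \(s^2\) above.

```python
from math import sqrt
from statistics import mean, variance

N = 100                                            # number of units in the field
counts = [234, 256, 128, 245, 211, 240, 202, 267]  # beetle counts in the 8 sampled units

y_bar = mean(counts)    # sample mean
s2 = variance(counts)   # sample variance, divisor n - 1
s = sqrt(s2)            # sample standard deviation
tau_hat = N * y_bar     # expansion estimator of the population total

print(f"y_bar   = {y_bar:.3f}")
print(f"s^2     = {s2:.3f}")
print(f"s       = {s:.3f}")
print(f"tau_hat = {tau_hat:.1f}")
```

Because everything is computed at full precision, the later digits can differ slightly from a hand calculation that rounds intermediate values.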

Properties of \(\bar{y}\) under simple random sampling

(i) unbiased

\begin{eqnarray}
  E(\bar{y}) &=& E\left( \dfrac{y_1+y_2+\ldots +y_n}{n} \right) \nonumber\\
&=& \dfrac{E(y_1)+E(y_2)+\ldots+E(y_n)}{n} \nonumber\\
&=& \dfrac{\mu+\mu+\ldots +\mu}{n} \nonumber\\
&=& \mu  \nonumber
\end{eqnarray}
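Unbiasedness can also be checked by brute force on a tiny population: list every possible sample, compute each sample mean, and average them with equal weight. The sketch below does this exactly with fractions. The five population values are hypothetical and were made up for the illustration; they are not data from the course.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 7, 8, 12, 20]  # hypothetical population values (N = 5)
n = 2                           # sample size

mu = Fraction(sum(population), len(population))  # population mean

# Under SRS every subset of n units is equally likely, so E(y_bar)
# is the plain average of the sample means over all subsets.
samples = list(combinations(population, n))
sample_means = [Fraction(sum(sample), n) for sample in samples]
expected_y_bar = sum(sample_means) / len(sample_means)

print("number of possible samples:", len(samples))
print("population mean mu:", mu)
print("average of all sample means:", expected_y_bar)
```

The two printed means agree exactly, which is what the derivation above asserts for any population and any n.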

(ii) Under simple random sampling, the variance of \(\bar{y}\) is:

\(Var(\bar{y})=\dfrac{N-n}{N} \cdot \dfrac{\sigma^2}{n}\)

Note that \(\frac{N-n}{N}=1-\frac{n}{N}\) is called the finite population correction fraction.

Note! When the sampling is done with replacement, the finite population correction does not appear. Likewise, when the sample size is very small compared to the population size, \(n/N\) is close to 0, so the fraction is close to 1 and can be ignored.

 Read Chapter 2.6 for the proof of \(Var(\bar{y})=\dfrac{N-n}{N}\cdot \dfrac{\sigma^2}{n}\)
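Short of the full proof, the formula can be verified exactly on a small example by enumerating every possible sample. The sketch below is only a numerical check under stated assumptions: the population values are hypothetical, and \(\sigma^2\) is computed with the \(N-1\) divisor as defined earlier.

```python
from fractions import Fraction
from itertools import combinations

population = [3, 7, 8, 12, 20]  # hypothetical population values
N = len(population)
n = 2

mu = Fraction(sum(population), N)
# Finite population variance with divisor N - 1
sigma2 = sum((y - mu) ** 2 for y in population) / (N - 1)

# Variance of y_bar by direct enumeration of all equally likely samples
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]
var_enumerated = sum((m - mu) ** 2 for m in sample_means) / len(sample_means)

# Variance of y_bar from the formula (N - n)/N * sigma^2 / n
var_formula = Fraction(N - n, N) * sigma2 / n

print("enumerated:", var_enumerated)
print("formula:   ", var_formula)
```

Both lines print the same fraction, so for this population the finite population correction accounts exactly for sampling without replacement.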

To estimate \(Var(\bar{y})\) from a single sample, replace \(\sigma^2\) by \(s^2\) in the formula. The estimate is denoted \(\hat{V}ar(\bar{y})\), and \(\hat{V}ar(\bar{y})=\dfrac{N-n}{N}\cdot\dfrac{s^2}{n}\).

Try it!

Estimate \(Var(\bar{y})\) for the data in Example 1-3 (beetle example).

 \begin{eqnarray}
  \hat{V}ar(\bar{y})&=& \dfrac{N-n}{N}\cdot \dfrac{s^2}{n} \nonumber\\
&=& \dfrac{100-8}{100}\cdot \dfrac{1932.657}{8} \nonumber\\
&=& 222.256  \nonumber
\end{eqnarray}
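The same estimate can be obtained programmatically. In this sketch, `estimated_var_of_mean` is a helper name introduced here for illustration; it applies the finite population correction to \(s^2/n\) exactly as in the formula above.

```python
from statistics import variance


def estimated_var_of_mean(sample, population_size):
    """Estimate Var(y_bar) under SRS: (N - n)/N * s^2 / n."""
    n = len(sample)
    fpc = (population_size - n) / population_size  # finite population correction
    return fpc * variance(sample) / n              # variance() uses divisor n - 1


counts = [234, 256, 128, 245, 211, 240, 202, 267]  # beetle counts, n = 8
var_y_bar = estimated_var_of_mean(counts, population_size=100)
print(f"estimated Var(y_bar) = {var_y_bar:.3f}")
```

The code carries \(s^2\) at full precision, so the result can differ in the last digits from a hand calculation that uses a rounded \(s^2\).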

Properties of \(\hat{\tau}\) under simple random sampling

(i) unbiased

\begin{eqnarray}
  E(\hat{\tau})&=& E(N \times \bar{y}) \nonumber\\
&=& N \times \mu \nonumber\\
&=& \tau  \nonumber
\end{eqnarray}

(ii) The variance of \(\hat{\tau}\) is:

\begin{eqnarray}
  Var(\hat{\tau})&=& Var(N \times \bar{y}) \nonumber\\
&=& N^2 \cdot Var(\bar{y}) \nonumber\\
&=& N^2 \cdot \dfrac{N-n}{N} \cdot \dfrac{\sigma^2}{n} \nonumber\\
&=& N \cdot (N-n) \cdot \dfrac{\sigma^2}{n}  \nonumber
\end{eqnarray}

The estimate for \(Var(\hat{\tau})\) is thus: \(\hat{V}ar(\hat{\tau})=N(N-n)\dfrac{s^2}{n}\)

Try it!

Estimate the variance of \(\hat{\tau}\) for the data on the number of beetles.

\begin{eqnarray}
  \hat{V}ar(\hat{\tau})&=& 100 \cdot (100-8) \cdot \frac{1932.657}{8} \nonumber\\
&=& 2222560 \nonumber\\
&=& N^2 \cdot \hat{V}ar(\bar{y}) \nonumber
\end{eqnarray}
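As a cross-check, the estimate of the total and its estimated variance can be computed together. This is a sketch under the same assumptions as the worked example (SRS without replacement, N = 100); the function name `estimate_total` is hypothetical and used only here.

```python
from statistics import mean, variance


def estimate_total(sample, population_size):
    """Return (tau_hat, estimated Var(tau_hat)) under SRS without replacement."""
    n = len(sample)
    tau_hat = population_size * mean(sample)  # expansion estimator N * y_bar
    # N * (N - n) * s^2 / n, which equals N^2 times the estimated Var(y_bar)
    var_tau_hat = population_size * (population_size - n) * variance(sample) / n
    return tau_hat, var_tau_hat


counts = [234, 256, 128, 245, 211, 240, 202, 267]  # beetle counts, n = 8
tau_hat, var_tau_hat = estimate_total(counts, population_size=100)
print(f"tau_hat = {tau_hat:.1f}")
print(f"estimated Var(tau_hat) = {var_tau_hat:.1f}")
```

Since the variance scales with \(N^2\), small rounding differences in \(s^2\) are magnified here, so the printed value may not match a hand calculation to the last digits.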