1.3 - Estimating Population Mean and Total under SRS
Simple Random Sampling
- Random sampling without replacement in which every possible sample of n units has the same probability of selection.
Example 1-2: Patient Records
A hospital has 1,125 patient records. How can one randomly select 120 records to review?
Answer
Assign a number from 1 to 1,125 to each record and randomly select 120 numbers from 1 to 1,125 without replacement.
In Minitab, use the following commands:
- Calc > Make Patterned Data > Simple Set of Numbers
- Then, Calc > Random Data > Sample From Columns... (sampling without replacement is the default)
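Outside Minitab, the same selection can be sketched in a few lines of Python. This is only an illustration and not part of the original example; the seed value and the variable names are arbitrary assumptions.

```python
import random

N_RECORDS = 1125   # number of patient records
SAMPLE_SIZE = 120  # number of records to review

# A fixed seed (arbitrary choice) makes the draw reproducible.
rng = random.Random(2024)

# random.sample draws without replacement, so no record number repeats.
selected = sorted(rng.sample(range(1, N_RECORDS + 1), SAMPLE_SIZE))

print(selected[:10])  # first ten selected record numbers
```

Because the draw is without replacement, each of the possible sets of 120 record numbers is equally likely, which is the defining property of a simple random sample.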
Example 1-3: Total Number of Beetles
Suppose we want to estimate the total number of beetles in an agricultural field. Subdivide the field into 100 equal-sized units, take a simple random sample of eight units, and count the number of beetles in each of these eight units.
1 | 11 | 21 | 31 | 41 | 51 | 61 | 71 | 81 | 91 |
---|---|---|---|---|---|---|---|---|---|
2 | 12 | 22 | 32 | 42 | 52 | 62 | 72 | 82 | 92 |
3 | 13 | 23 | 33 | 43 | 53 | 63 | 73 | 83 | 93 |
4 | 14 | 24 | 34 | 44 | 54 | 64 | 74 | 84 | 94 |
5 | 15 | 25 | 35 | 45 | 55 | 65 | 75 | 85 | 95 |
6 | 16 | 26 | 36 | 46 | 56 | 66 | 76 | 86 | 96 |
7 | 17 | 27 | 37 | 47 | 57 | 67 | 77 | 87 | 97 |
8 | 18 | 28 | 38 | 48 | 58 | 68 | 78 | 88 | 98 |
9 | 19 | 29 | 39 | 49 | 59 | 69 | 79 | 89 | 99 |
10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100 |
The field divided into 100 equal-sized units
Row | C1 | C2 |
---|---|---|
1 | 1 | 46 |
2 | 2 | 100 |
3 | 3 | 51 |
4 | 4 | 15 |
5 | 5 | 30 |
6 | 6 | 91 |
7 | 7 | 94 |
8 | 8 | 73 |
9 | 9 | |
10 | 10 | |
11 | 11 | |
12 | 12 | |
13 | 13 | |
C1 contains the list of numbers 1 - 100 and C2 contains the 8 random numbers generated by Minitab
N = 100 is the population size and n = 8 is the sample size. The sampled units are 15, 30, 46, 51, 73, 91, 94, and 100.
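The unit numbers run down the columns of the grid, so a sampled number can be converted into a field location. The helper below, `grid_position`, is a hypothetical name introduced for illustration; it assumes the 10-by-10 column-by-column numbering shown in the grid.

```python
def grid_position(unit, rows=10):
    """Return the 1-based (row, column) of a unit in the column-major grid."""
    row = (unit - 1) % rows + 1
    column = (unit - 1) // rows + 1
    return row, column

# The eight unit numbers listed in column C2 of the worksheet.
sampled_units = [46, 100, 51, 15, 30, 91, 94, 73]

for unit in sorted(sampled_units):
    row, column = grid_position(unit)
    print(f"unit {unit:3d}: row {row:2d}, column {column:2d}")
```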
Notation
Let \(y_i\) denote the number of beetles in the \(i\)th unit. \(N\) denotes the number of units in the population and \(n\) the number of units in the sample.
Variable of interest: \(y_1, \ldots, y_N\)
\(\mu=\dfrac{y_1+y_2+\ldots +y_N}{N}\) (the population mean)
\(\tau=y_1+y_2+\ldots +y_N=N \times \mu\) (the population total)
\(\mbox{sample mean}=\bar{y}=\hat{\mu}=\dfrac{y_1+y_2+\ldots +y_n}{n}\)
\(\mbox{estimate for population total}=\hat{\tau}=N \times \bar{y}\) (expansion estimator)
Finite population variance:
\(\sigma^2=\sum\limits^N_{i=1} \dfrac{(y_i-\mu)^2}{N-1}\)
\(\sigma^2\) can be estimated by sample variance \(s^2\)
\(s^2=\sum\limits^n_{i=1}\dfrac{(y_i-\bar{y})^2}{n-1}\)
Sample standard deviation: \(s=\sqrt{s^2}\)
Example 1-3b: Total Number of Beetles
For the beetle example, the observed counts in the eight sampled units are: 234, 256, 128, 245, 211, 240, 202, 267
- \(\bar{y}=222.875\)
- \(s^2=1932.696\)
- \(s=43.962\)
The estimate for the population total is:
\begin{eqnarray}
\hat{\tau} &=& N \times \bar{y} \nonumber\\
&=& 100 \times 222.875 \nonumber\\
&=& 22287.5 \nonumber
\end{eqnarray}
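The summary statistics and the expansion estimate can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative, and the last printed decimals can differ slightly from a hand calculation that rounds intermediate values.

```python
from statistics import mean, stdev, variance

N = 100  # number of units in the field (population size)
counts = [234, 256, 128, 245, 211, 240, 202, 267]  # beetles in the n = 8 sampled units

y_bar = mean(counts)     # sample mean, the estimate of mu
s2 = variance(counts)    # sample variance, divisor n - 1
s = stdev(counts)        # sample standard deviation
tau_hat = N * y_bar      # expansion estimator of the population total

print(f"y_bar   = {y_bar:.3f}")
print(f"s^2     = {s2:.3f}")
print(f"s       = {s:.3f}")
print(f"tau_hat = {tau_hat:.1f}")
```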
Properties of \(\bar{y}\) when one uses simple random sampling
(i) unbiased
\begin{eqnarray}
E(\bar{y}) &=& E\left( \dfrac{y_1+y_2+\ldots +y_n}{n} \right) \nonumber\\
&=& \dfrac{E(y_1)+E(y_2)+\ldots+E(y_n)}{n} \nonumber\\
&=& \dfrac{\mu+\mu+\ldots +\mu}{n} \nonumber\\
&=& \mu \nonumber
\end{eqnarray}
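Unbiasedness can also be checked numerically on a small population by listing every possible sample. The population below is hypothetical (it does not come from the beetle example) and is kept tiny so that all samples can be enumerated.

```python
from itertools import combinations

# Hypothetical population of N = 5 values, chosen only for illustration.
population = [3, 7, 8, 12, 20]
n = 2
mu = sum(population) / len(population)

# Under SRS every possible sample of size n is equally likely, so E(y_bar)
# is the plain average of y_bar over all possible samples.
sample_means = [sum(sample) / n for sample in combinations(population, n)]
expected_y_bar = sum(sample_means) / len(sample_means)

print(f"mu = {mu}, E(y_bar) = {expected_y_bar}")
```

Any single sample mean may be far from \(\mu\); it is the average over all equally likely samples that matches \(\mu\).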
(ii) under simple random sampling, the variance of \(\bar{y}\) is:
\(Var(\bar{y})=\dfrac{N-n}{N} \cdot \dfrac{\sigma^2}{n}\)
Note that \(\frac{N-n}{N}=1-\frac{n}{N}\) is called the finite population correction fraction.
Read Chapter 2.6 for the proof of \(Var(\bar{y})=\dfrac{N-n}{N}\cdot \dfrac{\sigma^2}{n}\)
Because \(\sigma^2\) is usually unknown, one estimates \(Var(\bar{y})\) from a single sample by replacing \(\sigma^2\) with \(s^2\) in the formula. The estimate for \(Var(\bar{y})\) is denoted as \(\hat{V}ar(\bar{y})\), and \(\hat{V}ar(\bar{y})=\dfrac{N-n}{N}\cdot\dfrac{s^2}{n}\).
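The variance formula and its estimate can be checked by brute force on a small population. The sketch below uses a hypothetical population of five values (not from the text), enumerates all samples of size two, and compares three quantities: the variance of \(\bar{y}\) computed directly, the formula value, and the average of \(\hat{V}ar(\bar{y})\) over all samples.

```python
from itertools import combinations

population = [3, 7, 8, 12, 20]  # hypothetical values, for illustration only
N, n = len(population), 2
mu = sum(population) / N
sigma2 = sum((y - mu) ** 2 for y in population) / (N - 1)  # divisor N - 1

samples = list(combinations(population, n))
means = [sum(s) / n for s in samples]

# Variance of y_bar computed directly over all equally likely samples.
var_direct = sum((m - mu) ** 2 for m in means) / len(means)

# Variance of y_bar from the formula with the finite population correction.
var_formula = (N - n) / N * sigma2 / n

def var_hat(sample):
    """Estimated variance of y_bar from one sample: (N - n)/N * s^2/n."""
    y_bar = sum(sample) / n
    s2 = sum((y - y_bar) ** 2 for y in sample) / (n - 1)
    return (N - n) / N * s2 / n

# Average of the single-sample estimates over all possible samples.
mean_var_hat = sum(var_hat(s) for s in samples) / len(samples)

print(var_direct, var_formula, mean_var_hat)
```

All three values agree for this population, which illustrates that the estimate is correct on average even though an individual sample can give a value above or below the true variance.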
Try it! For the beetle example, compute \(\hat{V}ar(\bar{y})\).
\begin{eqnarray}
\hat{V}ar(\bar{y})&=& \dfrac{N-n}{N}\cdot \dfrac{s^2}{n} \nonumber\\
&=& \dfrac{100-8}{100}\cdot \dfrac{1932.696}{8} \nonumber\\
&\approx& 222.260 \nonumber
\end{eqnarray}
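The same calculation can be wrapped in a small function. `estimated_variance_of_mean` is a hypothetical helper name, not something defined in the text; it assumes a simple random sample drawn without replacement from a population of known size N.

```python
from statistics import variance

def estimated_variance_of_mean(sample, N):
    """Estimate Var(y_bar) under SRS without replacement: (N - n)/N * s^2/n."""
    n = len(sample)
    fpc = (N - n) / N  # finite population correction
    return fpc * variance(sample) / n

counts = [234, 256, 128, 245, 211, 240, 202, 267]
var_hat_y_bar = estimated_variance_of_mean(counts, N=100)

print(f"{var_hat_y_bar:.3f}")
```

With n = 8 out of N = 100, the correction factor is 0.92, so ignoring it would overstate the estimated variance by roughly 9%.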
Properties of \(\hat{\tau}\) under simple random sampling:
(i) unbiased
\begin{eqnarray}
E(\hat{\tau})&=& E(N \times \bar{y}) \nonumber\\
&=& N \times \mu \nonumber\\
&=& \tau \nonumber
\end{eqnarray}
(ii) the variance of \(\hat{\tau}\) is:
\begin{eqnarray}
Var(\hat{\tau})&=& Var(N \times \bar{y}) \nonumber\\
&=& N^2 \cdot Var(\bar{y}) \nonumber\\
&=& N^2 \cdot \dfrac{N-n}{N} \cdot \dfrac{\sigma^2}{n} \nonumber\\
&=& N \cdot (N-n) \cdot \dfrac{\sigma^2}{n} \nonumber
\end{eqnarray}
The estimate for \(Var(\hat{\tau})\) is thus: \(\hat{V}ar(\hat{\tau})=N(N-n)\dfrac{s^2}{n}\)
Try it! For the beetle example, compute \(\hat{V}ar(\hat{\tau})\).
\begin{eqnarray}
\hat{V}ar(\hat{\tau})&=& N^2 \cdot \hat{V}ar(\bar{y}) \nonumber\\
&=& 100 \cdot (100-8) \cdot \frac{1932.696}{8} \nonumber\\
&\approx& 2222601 \nonumber
\end{eqnarray}
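As a check, the estimated variance of the total can be computed directly and compared with \(N^2\) times the estimated variance of the mean. The function name below is an illustrative assumption.

```python
from statistics import variance

def estimated_variance_of_total(sample, N):
    """Estimate Var(tau_hat) under SRS without replacement: N (N - n) s^2 / n."""
    n = len(sample)
    return N * (N - n) * variance(sample) / n

N = 100
counts = [234, 256, 128, 245, 211, 240, 202, 267]
n = len(counts)

var_hat_tau = estimated_variance_of_total(counts, N)
var_hat_y_bar = (N - n) / N * variance(counts) / n

print(f"Var_hat(tau_hat)     = {var_hat_tau:.1f}")
print(f"N^2 * Var_hat(y_bar) = {N**2 * var_hat_y_bar:.1f}")
```

The two printed values agree because multiplying an estimator by the constant N multiplies its variance by \(N^2\).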