Quite often, it is impossible to obtain a frame that lists only the elements of the population that one is interested in. For example, you may want to sample households with children, but the best frame available is a list of all households. In that case, we wish to estimate the parameters of a subpopulation of the population represented in the frame.
Main Issue: You do not know the size of the subpopulation.
Notation:
- N - the number of elements in the population
- \(N_1\) - the number of elements in the subpopulation
- n - sample size from the population
- \(n_1\) - the number of sampled elements from the subpopulation
- \(y_{1j}\) - the jth sampled observation that falls in the subpopulation
An unbiased estimator of the subpopulation mean, \(\mu_1\), is:
\(\bar{y}_1=\dfrac{1}{n_1}\sum\limits_{j=1}^{n_1} y_{1j}\)
Its variance is estimated by:
\(\hat{V}ar(\bar{y}_1)=\left(\dfrac{N_1-n_1}{N_1}\right)\dfrac{s^2_1}{n_1}\)
\(\text{where } s^2_1=\dfrac{\sum\limits_{j=1}^{n_1} (y_{1j}-\bar{y}_1)^2}{n_1-1}\)
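Usually, we do not know \(N_1\). Because the sampling fraction \(n_1/N_1\) in the subpopulation is expected to be close to the overall sampling fraction \(n/N\) under simple random sampling, we estimate the finite population correction factor: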
\(\dfrac{N_1-n_1}{N_1} \text{ by } \dfrac{N-n}{N}\)
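The estimator and its estimated variance can be sketched in Python as follows. This is a minimal illustration, assuming a simple random sample; the function name `subpopulation_mean`, its parameters, and the four data values in the usage lines are hypothetical and do not come from the lesson.

```python
import math
import statistics


def subpopulation_mean(y_sub, N, n, N1=None):
    """Estimate a subpopulation mean and its estimated variance.

    y_sub : values observed for sampled elements that fall in the subpopulation
    N, n  : population size and total sample size (from the frame)
    N1    : subpopulation size if known; if None, the finite population
            correction (N1 - n1) / N1 is replaced by (N - n) / N
    """
    n1 = len(y_sub)
    if n1 < 2:
        raise ValueError("need at least two subpopulation observations")
    y_bar = statistics.mean(y_sub)
    s2 = statistics.variance(y_sub)  # sample variance, divisor n1 - 1
    fpc = (N1 - n1) / N1 if N1 is not None else (N - n) / N
    var_hat = fpc * s2 / n1
    return y_bar, var_hat, math.sqrt(var_hat)


# Made-up data: 4 of the 10 sampled elements fall in the subpopulation.
y_bar, var_hat, sd_hat = subpopulation_mean([10.0, 12.0, 14.0, 16.0], N=100, n=10)
print(y_bar, round(var_hat, 4), round(sd_hat, 4))
```

Passing `N1` only when it is actually known keeps the default behavior aligned with the approximation above.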
Example 11-4: Amount spent on food
Let's say we want to estimate the average weekly amount spent on food by married graduate students in a certain college at Penn State. There are 80 graduate students in the college. Fifteen are sampled, and 10 of those are married. A summary of the data follows:
Variable | marital status | N | Mean | SE Mean | StDev |
---|---|---|---|---|---|
food cost | m | 10 | 135.3 | 14.1 | 44.4 |
food cost | s | 5 | 87.60 | 9.73 | 21.76 |
Try it!
The average food cost for married students is:
\(\bar{y}_m=135.3\)
The estimated variance of this estimate, and the corresponding estimated standard deviation, are:
\(\hat{V}ar(\bar{y}_m)=\dfrac{80-15}{80}\cdot \dfrac{44.4^2}{10}=160.173\)
\(\hat{S}D(\bar{y}_m)=12.656\)
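The arithmetic can be checked with a few lines of Python. This is only a sketch working from the summary statistics for married students in the table; the variable names are illustrative.

```python
import math

N, n = 80, 15                # graduate students in the college, students sampled
n_m, sd_m = 10, 44.4         # married students in the sample and their StDev

fpc = (N - n) / N            # stands in for (N_1 - n_1) / N_1, since N_1 is unknown
var_hat = fpc * sd_m**2 / n_m
sd_hat = math.sqrt(var_hat)

print(round(var_hat, 3), round(sd_hat, 3))  # 160.173 12.656
```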