11.3 - Estimation of means and totals over subpopulations

Quite often, it is impossible to obtain a frame that lists only the elements of the population that one is interested in. For example, you may want to sample households with children, but the best frame available is a list of all households. In that case, we wish to estimate the parameters of a subpopulation of the population represented in the frame.

Main Issue: You do not know the size of the subpopulation.

Notation:

  • N - the number of elements in the population
  • \(N_1\) - the number of elements in the subpopulation
  • n - sample size from the population
  • \(n_1\) - the number of sampled elements from the subpopulation
  • \(y_{1j}\) - the jth sampled observation that falls in the subpopulation

An unbiased estimator of the subpopulation mean, \(\mu_1\), is:

\(\bar{y}_1=\dfrac{1}{n_1}\sum\limits_{j=1}^{n_1} y_{1j}\)

Its variance is estimated by:

\(\hat{V}ar(\bar{y}_1)=\left(\dfrac{N_1-n_1}{N_1}\right)\dfrac{s^2_1}{n_1}\)

\(\text{where } s^2_1=\dfrac{\sum\limits_{j=1}^{n_1} (y_{1j}-\bar{y}_1)^2}{n_1-1}\)

Usually, we do not know \(N_1\), so we estimate the finite population correction factor:

\(\dfrac{N_1-n_1}{N_1} \text{ by } \dfrac{N-n}{N}\)
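The estimator and its estimated variance can be sketched in Python as follows. This is only a minimal sketch: the function name `subpop_mean_estimate` and its argument names are illustrative and do not come from these notes, and it assumes the subpopulation size \(N_1\) is unknown, so the correction factor \(\dfrac{N-n}{N}\) is used.

```python
import math
from statistics import mean, variance


def subpop_mean_estimate(y_sub, n, N):
    """Estimate a subpopulation mean and its standard deviation.

    y_sub : sampled observations that fall in the subpopulation (n_1 values)
    n     : total sample size drawn from the frame
    N     : number of elements in the population (the frame)
    """
    n1 = len(y_sub)
    if n1 < 2:
        raise ValueError("need at least two subpopulation observations")

    ybar1 = mean(y_sub)        # subpopulation sample mean
    s1_sq = variance(y_sub)    # sample variance, divisor n_1 - 1

    # N_1 is assumed unknown, so (N - n)/N stands in for (N_1 - n_1)/N_1.
    fpc = (N - n) / N
    var_hat = fpc * s1_sq / n1

    return ybar1, var_hat, math.sqrt(var_hat)
```

The function returns the estimated mean, the estimated variance of that mean, and the corresponding standard deviation.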

Example 11-4: Amount spent on food

Let's say we want to estimate the average weekly amount spent on food by married graduate students in a certain college at Penn State. There are 80 graduate students in the college. A sample of 15 students is taken, and 10 of the sampled students are married. A summary of the data follows:

Variable    Marital status   n     Mean     SE Mean   StDev
Food cost   m                10    135.3    14.1      44.4
            s                5     87.60    9.73      21.76

Try it!

What is the estimated average food cost for married students in that college at Penn State? Provide an estimate of the standard deviation of that estimate.

The estimated average food cost for married students is:

\(\bar{y}_m=135.3\)

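Because the number of married graduate students in the college is unknown, we use \(\dfrac{N-n}{N}\) as the finite population correction factor. The estimated variance and standard deviation of the estimate are: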

\(\hat{V}ar(\bar{y}_m)=\dfrac{80-15}{80}\cdot \dfrac{44.4^2}{10}=160.173\)

\(\hat{S}D(\bar{y}_m)=12.656\)
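The arithmetic can be checked with a few lines of Python. This is a minimal sketch that uses only the summary statistics given in the example; the variable names are illustrative.

```python
import math

# Values from the example: population size, sample size, and the
# summary statistics for the married students in the sample.
N, n = 80, 15
n_m, ybar_m, s_m = 10, 135.3, 44.4

# The subpopulation size is unknown, so (N - n)/N replaces (N_1 - n_1)/N_1.
fpc = (N - n) / N
var_hat = fpc * s_m**2 / n_m
sd_hat = math.sqrt(var_hat)

print(f"mean = {ybar_m}, variance = {var_hat:.3f}, SD = {sd_hat:.3f}")
```

Running this gives a variance of about 160.173 and a standard deviation of about 12.656, in agreement with the values above.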