2.3 - Population Variance

Linear combinations have not only a population mean but also a population variance. The population variance of a linear combination is a double sum, with j and k each running from 1 to p, over all pairs of variables:

\(\text{var}(Y) =\sum_{j=1}^{p}\sum_{k=1}^{p}c_jc_k\sigma_{jk}= c ^ { \prime } \Sigma c\)

In each term of the double sum, the product of the paired coefficients \(c_{j}\) and \(c_{k}\) is multiplied by the covariance \(\sigma_{jk}\) between the \(j^{th}\) and \(k^{th}\) variables. Equivalently, if \(\Sigma\) is the variance-covariance matrix of \(\mathbf{X}\), then \(\text{var}(Y) = c ^ { \prime } \Sigma c\).

Expressions of vectors and matrices of this form are called quadratic forms.
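To make the double sum concrete, here is a minimal sketch in plain Python. The function name `quadratic_form` and the 2 x 2 matrix and coefficients are hypothetical, chosen only for illustration; they do not come from the survey data.

```python
def quadratic_form(c, sigma):
    """Return c' Sigma c as the double sum over j and k of c_j * c_k * sigma_jk."""
    p = len(c)
    return sum(c[j] * c[k] * sigma[j][k] for j in range(p) for k in range(p))


# Hypothetical variance-covariance matrix (illustrative values only).
sigma = [[4.0, 1.0],
         [1.0, 9.0]]

# Hypothetical coefficients of the linear combination Y = 2*X1 - X2.
c = [2.0, -1.0]

var_y = quadratic_form(c, sigma)
print(var_y)  # 16 - 2 - 2 + 9 = 21.0
```

The four terms of the sum correspond to the four (j, k) pairs; the two off-diagonal pairs contribute the same amount because the matrix is symmetric.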

When using this expression, note that the covariance of a variable with itself, \(\sigma_{jj}\), is simply the variance of the \(j^{th}\) variable, \(\sigma_{j}^{2 }\):

\( \sigma_{jj} = \sigma^2_j\)

The variance of the random variable Y can be estimated by the sample variance, \(s^2_Y\). This is obtained by substituting the sample variances and covariances for the population variances and covariances:

\(s^2_Y = \sum_{j=1}^{p}\sum_{k=1}^{p}c_jc_ks_{jk} =c ^ { \prime } S c\)

The calculation can be simplified by splitting the double sum into two terms.

Sample variance of a linear combination: \(s^2_Y = \sum_{j=1}^{p}c^2_j s^2_j +2\sum_{j<k}c_jc_ks_{jk}\)

The first term sums over all the variables: each squared coefficient is multiplied by the variance of its variable. The second term sums over all unique pairs of variables with j less than k: the product of \(c_{j}\) and \(c_{k}\) is multiplied by the covariance between variables j and k. Since the covariance matrix is symmetric, each unique pair appears twice in the original double sum, so we must multiply this second sum by 2.
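The equivalence of the two-term formula and the full double sum can be checked numerically. The sketch below uses hypothetical function names and a made-up symmetric 3 x 3 matrix; it is an illustration, not part of the original lesson.

```python
def variance_double_sum(c, s):
    """Full double sum: sum over all (j, k) of c_j * c_k * s_jk."""
    p = len(c)
    return sum(c[j] * c[k] * s[j][k] for j in range(p) for k in range(p))


def variance_two_term(c, s):
    """Two-term form: squared coefficients times variances,
    plus twice the sum over unique pairs j < k."""
    p = len(c)
    diagonal = sum(c[j] ** 2 * s[j][j] for j in range(p))
    pairs = sum(c[j] * c[k] * s[j][k]
                for j in range(p) for k in range(j + 1, p))
    return diagonal + 2 * pairs


# Hypothetical symmetric covariance matrix and coefficients.
s = [[4.0, 1.0, 2.0],
     [1.0, 9.0, 3.0],
     [2.0, 3.0, 16.0]]
c = [1.0, 2.0, -1.0]

double_sum = variance_double_sum(c, s)
two_term = variance_two_term(c, s)
print(double_sum, two_term)  # both equal 44.0
```

The two-term version visits each off-diagonal entry only once, which is why the factor of 2 is needed. It relies on the matrix being symmetric.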

Example 2-4: Women’s Health Survey (Population Variance)

From the Women's Nutrition survey data in the previous lesson, we obtained the following sample variance-covariance matrix:

\(S = \left(\begin{array}{rrrrr}157829.4 & 940.1 & 6075.8 & 102411.1 & 6701.6 \\ 940.1 & 35.8 & 114.1 & 2383.2 & 137.7 \\ 6075.8 & 114.1 & 934.9 & 7330.1 & 477.2 \\ 102411.1 & 2383.2 & 7330.1 & 2668452.4 & 22063.3 \\ 6701.6 & 137.7 & 477.2 & 22063.3 & 5416.3 \end{array}\right)\)

Suppose we want to look at the total intake of vitamins A and C (in mg). Recall that we defined this earlier as:

\(Y = 0.001 X _ { 4 } + X _ { 5 }\)

Therefore the sample variance of Y is equal to \((0.001)^{2}\) times the variance for \(X_{4}\), plus the variance for \(X_{5}\), plus 2 times 0.001 times the covariance between \(X_{4}\) and \(X_{5}\). The next few lines carry out the mathematical calculations using these values.

\begin{align} s^2_Y &= 0.001^2s^2_4 + s^2_5 + 2 \times 0.001s_{45}\\ &= 0.000001 \times 2668452.4 + 5416.3 + 0.002 \times 22063.3\\ &= 2.7 + 5416.3 + 44.1 \\ &= 5463.1 \end{align}
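The same arithmetic can be reproduced as the quadratic form \(c ^ { \prime } S c\). The sketch below is plain Python; the matrix entries are copied from \(S\) above, the coefficient vector places 0.001 on \(X_{4}\) and 1 on \(X_{5}\), and the function name `sample_variance` is a hypothetical choice for this illustration.

```python
# Sample variance-covariance matrix S from the example above.
S = [
    [157829.4, 940.1, 6075.8, 102411.1, 6701.6],
    [940.1, 35.8, 114.1, 2383.2, 137.7],
    [6075.8, 114.1, 934.9, 7330.1, 477.2],
    [102411.1, 2383.2, 7330.1, 2668452.4, 22063.3],
    [6701.6, 137.7, 477.2, 22063.3, 5416.3],
]

# Coefficients for Y = 0.001 * X4 + X5 (zero weight on X1, X2, X3).
c = [0.0, 0.0, 0.0, 0.001, 1.0]


def sample_variance(c, s):
    """Return c' S c as the double sum of c_j * c_k * s_jk."""
    p = len(c)
    return sum(c[j] * c[k] * s[j][k] for j in range(p) for k in range(p))


s2_y = sample_variance(c, S)
print(round(s2_y, 1))  # 5463.1, matching the hand calculation
```

Because the coefficients on the first three variables are zero, only the entries for \(X_{4}\) and \(X_{5}\) contribute, which is why the hand calculation needs just three terms.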
