Example 4-4: Wechsler Adult Intelligence Scale
Here we have data on n = 37 subjects taking the Wechsler Adult Intelligence Test. This test is broken up into four different components:
- Information (Info)
- Similarities (Sim)
- Arithmetic (Arith)
- Picture Completion (Pic)
The data are stored in five different columns. The first column is the ID number of the subjects, followed by the four component tasks in the remaining four columns.
Download the CSV file: wechsler.csv
These data may be analyzed using the SAS program shown below.
Download the SAS file: wechsler.sas
options ls=78;
title "Eigenvalues and Eigenvectors - Wechsler Data";
/* The data and infile statements define a data set named 'wechsler'
 * and specify the path where its contents are read from.
* Since we have a header row, the first observation begins on the 2nd row,
* and the delimiter option is needed because columns are separated by commas.
* The input statement is where we provide names for the variables in order
* of the columns in the data set. If any were categorical (not the case here),
* we would need to put a '$' character after its name.
*/
data wechsler;
infile "D:\Statistics\STAT 505\data\wechsler.csv" firstobs=2 delimiter=',';
input id info sim arith pict;
run;
/* This prints the specified variable(s) from the data set 'wechsler'.
* Since no variables are specified, all are printed.
*/
proc print data=wechsler;
run;
/* The princomp procedure calculates the eigenvalues and eigenvectors
* for the variables specified in the var statement. The default is
* to operate on the correlation matrix of the data, but the 'cov' option
* indicates to use the covariance matrix instead.
*/
proc princomp data=wechsler cov;
var info sim arith pict;
run;
Walk through the procedures of the program. Just as in previous lessons, marking up a printout of the SAS program is a good strategy for learning how it is put together.
The SAS output (download below) gives the results of the data analysis. Because the output is relatively long, printing it out and annotating it with notes is highly recommended.
Download the SAS output here: wechsler.lst
Produce the Covariance Matrix for the Wechsler Adult Intelligence Test Data
To find the sample covariance matrix of a multivariate data set in Minitab (a SAS alternative is sketched after these steps):
- Stat > Basic Statistics > Covariance
- Highlight and select the names of all the variables of interest to move them into the window on the right.
- Check the box for ‘Store matrix’.
- Select ‘OK’. No results are displayed at this point.
- Data > Display Data
- Highlight and select M1 and choose ‘Select’ to move it into the window on the right.
- Select ‘OK’ to display the sample covariance matrix.
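If you are working in SAS rather than Minitab, the same sample covariance matrix can be obtained with the corr procedure. This is a minimal sketch, assuming the 'wechsler' data set created by the SAS program above; the cov option requests the covariance matrix in addition to the correlations.
/* Print the sample covariance matrix of the four subtests.
 * The 'cov' option adds the covariance matrix to the default output.
 */
proc corr data=wechsler cov;
var info sim arith pict;
run;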
Analysis
We obtain the following sample means.
Variable | Mean |
---|---|
Information | 12.568 |
Similarities | 9.568 |
Arithmetic | 11.486 |
Picture Completion | 7.973 |
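These means can be reproduced in SAS with the means procedure; a minimal sketch, again assuming the 'wechsler' data set defined above:
/* Sample means of the four subtests, rounded to three decimal places */
proc means data=wechsler mean maxdec=3;
var info sim arith pict;
run;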
Variance-Covariance Matrix
\(\textbf{S} = \left(\begin{array}{rrrr}11.474 & 9.086 & 6.383 & 2.071\\ 9.086 & 12.086 & 5.938 & 0.544\\ 6.383 & 5.938 & 11.090 & 1.791\\ 2.071 & 0.544 & 1.791 & 3.694 \end{array}\right)\)
Here, for example, the variance for Information is 11.474, and the variance for Similarities is 12.086. The covariance between Similarities and Information is 9.086. The total variance, which is the sum of the individual variances, is approximately 38.344.
The eigenvalues are given below:
\(\lambda_1 = 26.245\), \(\lambda_2 = 6.255\), \(\lambda_3 = 3.932\), \(\lambda_4 = 1.912\)
and finally, at the bottom of the table, we have the corresponding eigenvectors, listed below:
\(\mathbf{e_1}=\left(\begin{array}{r}0.606\\0.605\\0.505\\0.110 \end{array}\right)\), \(\mathbf{e_2}=\left(\begin{array}{r}-0.218\\-0.496\\0.795\\0.274 \end{array}\right)\), \(\mathbf{e_3}=\left(\begin{array}{r}0.461\\-0.320\\-0.335\\0.757 \end{array}\right)\), \(\mathbf{e_4}=\left(\begin{array}{r}-0.611\\0.535\\-0.035\\0.582 \end{array}\right)\)
For example, the eigenvector corresponding to the largest eigenvalue, 26.245, has elements 0.606, 0.605, 0.505, and 0.110.
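If SAS/IML is available, the eigenvalues and eigenvectors reported by proc princomp can also be verified directly from the covariance matrix. The following is a minimal sketch that enters S by hand; note that the trace of S (the total variance, 38.344) equals the sum of the eigenvalues, and that eigenvectors are only determined up to sign.
proc iml;
/* sample covariance matrix from the output above */
S = {11.474  9.086  6.383 2.071,
      9.086 12.086  5.938 0.544,
      6.383  5.938 11.090 1.791,
      2.071  0.544  1.791 3.694};
call eigen(lambda, E, S);   /* eigenvalues (descending) and eigenvectors of S */
totalVar = trace(S);        /* total variance = sum of the eigenvalues */
print lambda E totalVar;
quit;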
Now, let's consider the shape of the 95% prediction ellipse formed by the multivariate normal distribution whose variance-covariance matrix is equal to the sample variance-covariance matrix we just obtained.
Recall the formula for the half-lengths of the axes of this ellipse: each half-length is the square root of the product of the corresponding eigenvalue and the critical value from the chi-square table. In this case we need the chi-square distribution with four degrees of freedom because we have four variables. For a 95% prediction ellipse, the chi-square critical value with four degrees of freedom is 9.49.
For the first and longest axis, we substitute the largest eigenvalue, 26.245, multiply by 9.49, and take the square root. This gives a half-length of 15.782, as shown below:
\begin{align} l_1 &= \sqrt{\lambda_1\chi^2_{4,0.05}}\\ &= \sqrt{26.245 \times 9.49}\\ &= 15.782 \end{align}
The direction of the axis is given by the first eigenvector. Looking at this first eigenvector we can see large positive elements corresponding to the first three variables. In other words, large elements for Information, Similarities, and Arithmetic. This suggests that this particular axis points in the direction specified by \(e_{1}\); that is, increasing values of Information, Similarities, and Arithmetic.
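As a quick check on this arithmetic, the chi-square critical value and the first half-length could be computed in a short SAS data step (a sketch; cinv returns the requested quantile of the chi-square distribution):
data _null_;
chi2 = cinv(0.95, 4);        /* about 9.49 */
l1   = sqrt(26.245 * chi2);  /* half-length of the first axis, about 15.78 */
put chi2= l1=;
run;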
The half-length of the second longest axis can be obtained by substituting 6.255 for the second eigenvalue, multiplying this by 9.49, and taking the square root. We obtain a half-length of about 7.7 or about half the length of the first axis.
\begin{align} l_2 &= \sqrt{\lambda_2\chi^2_{4,0.05}}\\ &= \sqrt{6.255 \times 9.49}\\ &= 7.705 \end{align}
So, if you were to picture this particular ellipse you would see that the second axis is about half the length of the first and longest axis.
Looking at the corresponding eigenvector, \(e_{2}\), we can see that this axis points in the direction of increasing values of the third variable, Arithmetic, and decreasing values of the second variable, Similarities.
Similar calculations can then be carried out for the third-longest axis of the ellipse as shown below:
\begin{align} l_3 &= \sqrt{\lambda_3\chi^2_{4,0.05}}\\ &= \sqrt{3.932 \times 9.49}\\ &= 6.108 \end{align}
This third axis has a half-length of 6.108, which is not much shorter than the second axis. It points in the direction of \(e_{3}\); that is, increasing values of Picture Completion and Information, and decreasing values of Similarities and Arithmetic.
The shortest axis has a half-length of about 4.260 as shown below:
\begin{align} l_4 &= \sqrt{\lambda_4\chi^2_{4,0.05}}\\ &= \sqrt{1.912 \times 9.49}\\ &= 4.260 \end{align}
It points in the direction of \(e_{4}\); that is, increasing values of Similarities and Picture Completion, and decreasing values of Information.
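Putting the four axes together, the minimal SAS/IML sketch below computes all four half-lengths at once and expresses each relative to the longest; this comparison is what determines the overall shape discussed next.
proc iml;
lambda  = {26.245, 6.255, 3.932, 1.912};  /* eigenvalues from above */
chi2    = cinv(0.95, 4);
halfLen = sqrt(lambda # chi2);            /* half-lengths of the four axes */
relLen  = halfLen / max(halfLen);         /* lengths relative to the longest axis */
print halfLen relLen;
quit;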
The overall shape of the ellipse can be obtained by comparing the lengths of the various axes. What we have here is basically an ellipse that is the shape of a slightly squashed football.
We can also obtain the volume of the hyper-ellipse using the formula given earlier. Again, for a 95% prediction ellipse, the critical value from the chi-square distribution with four degrees of freedom is 9.49. Substituting into the expression, the square root contains the product of the eigenvalues, and the gamma function is evaluated at 2, where \(\Gamma(2) = 1\). Carrying out the math, we end up with a volume of 15,613.132 as shown below:
\begin{align} \frac{2\pi^{p/2}}{p\Gamma\left(\frac{p}{2}\right)}(\chi^2_{p,\alpha})^{p/2}|\Sigma|^{1/2} &= \frac{2\pi^{p/2}}{p\Gamma\left(\frac{p}{2}\right)}(\chi^2_{p,\alpha})^{p/2}\sqrt{\prod_{j=1}^{p}\lambda_j} \\[10pt] &= \frac{2\pi^2}{4\Gamma(2)}(9.49)^2\sqrt{26.245 \times 6.255 \times 3.932 \times 1.912}\\[10pt] &= 444.429 \sqrt{1234.17086}\\[10pt] &= 15613.132\end{align}
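The volume calculation can likewise be reproduced in SAS/IML. Below is a minimal sketch with p = 4 and the eigenvalues entered from above; the product of the eigenvalues could equivalently be computed as det(S).
proc iml;
p       = 4;
chi2    = cinv(0.95, p);                   /* chi-square critical value, about 9.49 */
lambda  = {26.245, 6.255, 3.932, 1.912};   /* eigenvalues from above */
genVar  = lambda[#];                       /* product of the eigenvalues = |S| */
piTerm  = constant('pi') ## (p/2);         /* pi raised to p/2 */
chiTerm = chi2 ## (p/2);                   /* critical value raised to p/2 */
volume  = (2 * piTerm / (p * gamma(p/2))) * chiTerm * sqrt(genVar);
print volume;                              /* about 15,600; matches the hand calculation up to rounding */
quit;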