12.4  Example: Places Rated Data  Principal Component Method
Example 12-1: Places Rated
Let's revisit the Places Rated Example from Lesson 11. Recall that the Places Rated Almanac (Boyer and Savageau) rates 329 communities according to nine criteria:
 Climate and Terrain
 Housing
 Health Care & Environment
 Crime
 Transportation
 Education
 The Arts
 Recreation
 Economic
Except for housing and crime, the higher the score the better. For housing and crime, the lower the score the better.
Our objective here is to describe the relationships among the variables.
Before carrying out a factor analysis we need to determine m, the number of common factors to include in the model. This requires determining how many parameters will be involved.
For p = 9, the variance-covariance matrix \(\Sigma\) contains
\(\dfrac{p(p+1)}{2} = \dfrac{9 \times 10}{2} = 45\)
unique elements or entries. For a factor analysis with m factors, the number of parameters in the factor model is equal to
\(p(m+1) = 9(m+1)\)
Taking m = 4, we have 45 parameters in the factor model. This equals the number of original parameters and would result in no dimension reduction. So in this case, we will select m = 3, yielding 36 parameters in the factor model and thus a dimension reduction in our analysis.
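The parameter counts above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the function names are illustrative and do not come from the lesson's SAS or Minitab code.

```python
def covariance_parameters(p):
    """Unique entries in a p x p variance-covariance matrix: p(p+1)/2."""
    return p * (p + 1) // 2


def factor_model_parameters(p, m):
    """Parameters in an m-factor model: p*m loadings plus p specific variances."""
    return p * (m + 1)


p = 9
original = covariance_parameters(p)
for m in range(1, 5):
    in_model = factor_model_parameters(p, m)
    reduces = in_model < original
    print(f"m = {m}: {in_model} parameters (original: {original}), reduction: {reduces}")
```

Only values of m whose count falls below 45 give any dimension reduction, which is why m = 4 is ruled out here.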
It is also common to look at the results of the principal components analysis. The output from Lesson 11.6 is below. The first three components explain 62% of the variation. We consider this to be sufficient for the current example and will base future analyses on three components.
Component  Eigenvalue  Proportion  Cumulative 
1  3.2978  0.3664  0.3664 
2  1.2136  0.1348  0.5013 
3  1.1055  0.1228  0.6241 
4  0.9073  0.1008  0.7249 
5  0.8606  0.0956  0.8205 
6  0.5622  0.0625  0.8830 
7  0.4838  0.0538  0.9368 
8  0.3181  0.0353  0.9721 
9  0.2511  0.0279  1.0000 
We need to select m so that a sufficient amount of variation in the data is explained. What is sufficient is, of course, subjective and depends on the example at hand.
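The Proportion and Cumulative columns follow directly from the eigenvalues. The sketch below recomputes them from the values in the table; the helper `smallest_m` and the 60% threshold are illustrative assumptions, not a rule given in the lesson.

```python
from itertools import accumulate

# Eigenvalues copied from the principal components output above.
eigenvalues = [3.2978, 1.2136, 1.1055, 0.9073, 0.8606,
               0.5622, 0.4838, 0.3181, 0.2511]

total = sum(eigenvalues)
proportion = [ev / total for ev in eigenvalues]
cumulative = list(accumulate(proportion))


def smallest_m(cumulative, threshold):
    """Smallest number of components whose cumulative proportion reaches threshold."""
    for m, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return m
    return len(cumulative)


for i, (ev, prop, cum) in enumerate(zip(eigenvalues, proportion, cumulative), start=1):
    print(f"{i}  {ev:.4f}  {prop:.4f}  {cum:.4f}")

# Hypothetical threshold chosen for illustration only.
print("components needed for 60%:", smallest_m(cumulative, 0.60))
```

Changing the threshold shows how sensitive the choice of m is to what counts as "sufficient".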
Alternatively, often in social sciences, the underlying theory within the field of study indicates how many factors to expect. In psychology, for example, a circumplex model suggests that mood has two factors: positive affect and arousal. So a two-factor model may be considered for questionnaire data regarding the subjects' moods. In many respects, this is a better approach because then you are letting the science drive the statistics rather than the statistics drive the science! If you can, use your or a field expert's scientific understanding to determine how many factors should be included in your model.
Using SAS
The factor analysis is carried out using the program as shown below:
Download the SAS Program here: places2.sas
View the video explanation of the SAS code.
Using Minitab
View the video below to see how to perform a factor analysis using the Minitab statistical software application.
Initially, we will look at the factor loadings. The loadings for the ith factor are obtained from the estimated eigenvalue \(\hat{\lambda}_{i}\) and eigenvector \(\hat{e}_{i}\) using the expression:
\(\hat{e}_{i}\sqrt{ \hat{\lambda}_{i}}\)
These are summarized in the table below. The factor loadings are only recorded for the first three factors because we set m=3. We should also note that the factor loadings are the correlations between the factors and the variables. For example, the correlation between the Arts and the first factor is about 0.86. Similarly, the correlation between climate and that factor is only about 0.28.
Factor  
Variable  1  2  3 
Climate  0.286  0.076  0.841 
Housing  0.698  0.153  0.084 
Health  0.744  0.410  0.020 
Crime  0.471  0.522  0.135 
Transportation  0.681  0.156  0.148 
Education  0.498  0.498  0.253 
Arts  0.861  0.115  0.011 
Recreation  0.642  0.322  0.044 
Economics  0.298  0.595  0.533 
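The expression \(\hat{e}_{i}\sqrt{ \hat{\lambda}_{i}}\) can be sketched in Python with NumPy. The Places Rated data are not reproduced here, so the 3 x 3 correlation matrix `R` below is a made-up example, and `pc_loadings` is an illustrative helper rather than the lesson's SAS or Minitab procedure.

```python
import numpy as np


def pc_loadings(R, m):
    """Principal component loadings: each eigenvector scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:m]     # indices of the m largest eigenvalues
    return eigvecs[:, order] * np.sqrt(eigvals[order])


# Hypothetical correlation matrix, used only to illustrate the calculation.
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

L = pc_loadings(R, 2)
print(np.round(L, 3))
```

The sign of each column is arbitrary, because an eigenvector multiplied by -1 is still an eigenvector. Keeping all p factors reproduces the correlation matrix exactly (L L' = R), while keeping m < p gives an approximation.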
Interpreting factor loadings is similar to interpreting the coefficients for principal component analysis. We want to determine some inclusion criterion, which in many instances may be somewhat arbitrary. In the above table, we consider values of about 0.5 or more in magnitude to be large. The following statements are based on this criterion:

Factor 1 is correlated most strongly with Arts (0.861) and also correlated with Health, Housing, Transportation, Recreation, and to a lesser extent Crime and Education. You can say that the first factor is primarily a measure of these variables.

Similarly, Factor 2 is correlated most strongly with Crime, Education, and Economics. You can say that the second factor is primarily a measure of these variables.

Likewise, Factor 3 is correlated most strongly with Climate and Economics. You can say that the third factor is primarily a measure of these variables.
The interpretation above is very similar to that obtained in the standardized principal component analysis.
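The cutoff rule used in the interpretation can be applied mechanically. The sketch below uses the loadings exactly as printed in the table above and a strict cutoff of 0.5 on the magnitude; the function name `large_loadings` is illustrative.

```python
# Loadings as printed in the table above: (Factor 1, Factor 2, Factor 3).
loadings = {
    "Climate":        (0.286, 0.076, 0.841),
    "Housing":        (0.698, 0.153, 0.084),
    "Health":         (0.744, 0.410, 0.020),
    "Crime":          (0.471, 0.522, 0.135),
    "Transportation": (0.681, 0.156, 0.148),
    "Education":      (0.498, 0.498, 0.253),
    "Arts":           (0.861, 0.115, 0.011),
    "Recreation":     (0.642, 0.322, 0.044),
    "Economics":      (0.298, 0.595, 0.533),
}


def large_loadings(loadings, factor, cutoff=0.5):
    """Variables whose loading on the given factor (0-based) is at least cutoff in magnitude."""
    return [name for name, row in loadings.items() if abs(row[factor]) >= cutoff]


for factor in range(3):
    print(f"Factor {factor + 1}:", ", ".join(large_loadings(loadings, factor)))
```

With a strict 0.5 cutoff, Education (0.498 on both of the first two factors) and Crime (0.471 on the first) fall just below the line, which is why the text treats them as borderline under an "about 0.5" criterion.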