6.4 - Geometric Interpretation

Principal components analysis (PCA) projects the data along the directions where the data varies the most.

[Figure: PCA plot]

The first direction is given by the eigenvector \(\mathbf{v}_1\), which corresponds to the largest eigenvalue \(d_1^2\).

The second direction is given by the eigenvector \(\mathbf{v}_2\), which corresponds to the second largest eigenvalue \(d_2^2\).

The variance of the data along each principal component direction is proportional to the corresponding eigenvalue: the larger \(d_j^2\) is, the more the data varies along \(\mathbf{v}_j\).
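The relationship between eigenvalues and variance can be checked numerically. The following is a minimal sketch, assuming that \(d_j\) denotes the \(j\)-th singular value of the column-centered data matrix, so that \(d_j^2\) is an eigenvalue of \(\mathbf{X}^T\mathbf{X}\). The simulated data, the covariance matrix, and all variable names are illustrative assumptions, not part of the notes.

```python
import numpy as np

# Hypothetical 2-D data with correlated columns (illustrative only).
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[3.0, 1.5], [1.5, 1.0]],
                            size=500)

# Center each column so the directions describe variation about the mean.
Xc = X - X.mean(axis=0)

# SVD of the centered data: rows of Vt are v_1, v_2, ...; d holds d_1 >= d_2.
U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
eigvals = d ** 2  # eigenvalues of Xc^T Xc (assumed meaning of d_j^2)

# Project the data onto the principal component directions.
scores = Xc @ Vt.T

# Sample variance of the data along each direction.
score_var = scores.var(axis=0, ddof=1)

# Under these assumptions, the variance along v_j equals d_j^2 / (N - 1).
print(np.allclose(score_var, eigvals / (len(Xc) - 1)))
```

The divisor \(N - 1\) reflects the sample-variance convention chosen here; dividing by \(N\) instead only rescales every eigenvalue by the same factor and leaves the ordering of the directions unchanged.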

Choice of How Many Components to Extract

Scree Plot – This is a useful visual aid that shows the amount of variance explained by each successive eigenvalue, with the eigenvalues ordered from largest to smallest.

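The choice of how many components to extract is fairly arbitrary: there is no single correct cutoff, and it depends on how much of the variability needs to be retained.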

When conducting principal components analysis prior to further analyses, it is risky to choose too small a number of components, which may fail to explain enough of the variability in the data.
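One common way to make the choice concrete is to look at the cumulative proportion of variance explained. The sketch below assumes the eigenvalues are already sorted in decreasing order; the helper names `explained_variance_ratio` and `n_components_for`, the example eigenvalues, and the thresholds are hypothetical and chosen only for illustration.

```python
import numpy as np

def explained_variance_ratio(eigvals):
    """Proportion of total variance explained by each eigenvalue."""
    eigvals = np.asarray(eigvals, dtype=float)
    return eigvals / eigvals.sum()

def n_components_for(eigvals, threshold=0.85):
    """Smallest number of leading components whose cumulative
    explained variance reaches `threshold` (a value between 0 and 1)."""
    cumulative = np.cumsum(explained_variance_ratio(eigvals))
    # Index of the first cumulative value >= threshold, as a count.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against round-off when the threshold is at or near 1.
    return min(k, len(cumulative))

# Hypothetical eigenvalues, sorted from largest to smallest.
eigvals = [5.0, 3.0, 1.0, 1.0]
ratios = explained_variance_ratio(eigvals)
k = n_components_for(eigvals, threshold=0.85)
print(ratios, k)
```

Plotting `ratios` against the component index gives the scree plot described above. The threshold is itself a judgment call, so this only makes the trade-off explicit rather than removing the arbitrariness.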