7.1 - Principal Components Regression (PCR)

In principal components regression (PCR), we first perform principal components analysis (PCA) on the original data, then reduce the dimension by choosing the number of principal components (m) to retain, typically by cross-validation or test-set error, and finally regress the response on those first m principal components.

  • Principal components regression forms the derived input columns \(\mathbf{z}_m = \mathbf{X}\mathbf{v}_m \) and then regresses y on \(z_1, z_2, \cdots , z_m\) for some \(m \leq p\).
  • Principal components regression discards the \(p - m\) smallest eigenvalue components.
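The procedure above can be sketched directly with an SVD, which yields the principal component directions \(\mathbf{v}_j\) as the right singular vectors of the centered \(\mathbf{X}\). This is an illustrative sketch, not a tuned implementation: the choice m = 2 is an assumption standing in for the cross-validated choice, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 100, 5, 2  # m = number of components kept (assumed; normally chosen by CV)

# Synthetic data, centered so PCA directions come straight from the SVD
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))
X -= X.mean(axis=0)
y = X @ rng.standard_normal(p) + 0.1 * rng.standard_normal(n)
y -= y.mean()

# SVD: rows of Vt are the principal component directions v_1, ..., v_p
U, d, Vt = np.linalg.svd(X, full_matrices=False)

# Derived inputs z_j = X v_j for the first m directions
Z = X @ Vt[:m].T

# Regress y on z_1, ..., z_m (least squares on the derived inputs)
theta, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Map the fit back to coefficients on the original p predictors
beta_pcr = Vt[:m].T @ theta
```

Because the derived columns \(z_j\) are mutually orthogonal, the regression on them decouples into m simple regressions, which is part of PCR's appeal.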

Dimension reduction is achieved by zeroing out the projections onto the principal component directions with small eigenvalues (i.e., keeping only the high-variance directions). In this sense, PCR is closely related to ridge regression. Ridge regression can be viewed conceptually as projecting the y vector onto the principal component directions and then shrinking the projection along each direction, where the amount of shrinkage depends on the variance of that principal component: low-variance directions are shrunk more. Ridge regression shrinks every component but never shrinks any of them exactly to zero. By contrast, PCR either leaves a component unshrunk or shrinks it all the way to zero.
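The contrast can be made concrete by comparing the per-component shrinkage factors. For ridge regression with penalty \(\lambda\), the projection onto the j-th principal component is multiplied by \(d_j^2 / (d_j^2 + \lambda)\), where \(d_j\) is the j-th singular value of the centered \(\mathbf{X}\); PCR instead uses a hard 0/1 factor. A minimal sketch, assuming an arbitrary \(\lambda = 10\) and PCR cutoff m = 2:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 4))
X -= X.mean(axis=0)

# Singular values d_j; d_j^2 is proportional to the j-th component's variance
d = np.linalg.svd(X, compute_uv=False)

lam, m = 10.0, 2  # assumed ridge penalty and PCR cutoff, for illustration

# Ridge: smooth shrinkage, strictly between 0 and 1 for every component
ridge_shrink = d**2 / (d**2 + lam)

# PCR: keep the first m components untouched, zero out the rest
pcr_shrink = (np.arange(len(d)) < m).astype(float)
```

Printing the two arrays side by side shows ridge's factors decaying gradually with the component variance, while PCR's jump from 1 to 0 at the cutoff.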