Capture the intrinsic variability in the data.
Reduce the dimensionality of a data set, either to ease interpretation or as a way to avoid overfitting and to prepare for subsequent analysis.
The sample covariance matrix of $X$ is $S = X^T X / N$, since $X$ has zero mean.
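As a quick numerical check of the covariance formula, the sketch below centers a small random matrix and compares $X^T X / N$ with NumPy's biased covariance estimate. The data, the sizes `N` and `p`, and the seed are made up for illustration; they are not from the notes.

```python
import numpy as np

rng = np.random.default_rng(0)        # arbitrary seed, illustration only
N, p = 200, 3                         # hypothetical sample size and dimension
X_raw = rng.normal(size=(N, p))

# Center each column so that X has zero mean, as the formula assumes.
X = X_raw - X_raw.mean(axis=0)

# Sample covariance with the 1/N normalization used in the text.
S = X.T @ X / N

# Reference: np.cov centers internally; bias=True selects division by N.
S_ref = np.cov(X_raw, rowvar=False, bias=True)

print(np.allclose(S, S_ref))
```

Note that `np.cov` divides by $N - 1$ by default, so `bias=True` is needed to match the $1/N$ convention used here.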
Eigendecomposition of $X^T X$, using the SVD $X = U D V^T$:
$X^T X = (U D V^T)^T (U D V^T) = V D^T U^T U D V^T = V D^2 V^T$
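The identity above can be checked numerically. The following is a minimal sketch on made-up data, assuming the thin SVD returned by `np.linalg.svd`; it confirms that the squared singular values are the eigenvalues of $X^T X$ and that the columns of $V$ are eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(1)        # arbitrary seed, illustration only
X = rng.normal(size=(50, 4))          # hypothetical 50 x 4 data matrix
X -= X.mean(axis=0)                   # center the columns

# Thin SVD: X = U @ diag(d) @ Vt, singular values d in descending order.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

gram = X.T @ X
recon = V @ np.diag(d**2) @ Vt        # V D^2 V^T

# Eigenvalues of the symmetric matrix X^T X, reordered to descending.
eigvals = np.linalg.eigvalsh(gram)[::-1]

print(np.allclose(gram, recon))       # X^T X equals V D^2 V^T
print(np.allclose(eigvals, d**2))     # eigenvalues are squared singular values
```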
The eigenvectors of $X^T X$ (i.e., $v_j$, $j = 1, \dots, p$) are called principal component directions of $X$.
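The first principal component direction $v_1$ has the following properties: it is the eigenvector corresponding to the largest eigenvalue, $d_1^2$, of $X^T X$, and it is the direction with the largest projected variance.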
The second principal component direction $v_2$ (the direction orthogonal to the first component that has the largest projected variance) is the eigenvector corresponding to the second largest eigenvalue, $d_2^2$, of $X^T X$, and so on. (The eigenvector for the $k$th largest eigenvalue corresponds to the $k$th principal component direction $v_k$.)
The $k$th principal component of $X$, $z_k = X v_k$, has maximum variance $d_k^2 / N$, subject to being orthogonal to the earlier ones.
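To make the variance statement concrete, the sketch below projects centered data onto the principal component directions and compares the variance of each component with $d_k^2 / N$. The correlated test data, the mixing matrix `A`, and the sizes are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)        # arbitrary seed, illustration only
N, p = 500, 3                         # hypothetical sample size and dimension
A = rng.normal(size=(p, p))           # hypothetical mixing matrix
X = rng.normal(size=(N, p)) @ A       # columns are correlated
X -= X.mean(axis=0)                   # center the columns

U, d, Vt = np.linalg.svd(X, full_matrices=False)

# Column k of Z is the k-th principal component z_k = X v_k.
Z = X @ Vt.T

# Variance of each component with 1/N normalization (ddof=0 is the default).
var_z = Z.var(axis=0)

print(np.allclose(var_z, d**2 / N))   # variance of z_k equals d_k^2 / N
print(np.allclose(Z, U * d))          # equivalently z_k = u_k d_k
```

Because the singular values come back in descending order, `var_z` is non-increasing, and the columns of `Z` are mutually orthogonal.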