6.6 - More Examples

Example 1: Handwritten Digit Recognition

  • Goal: Identify single digits, 0 to 9, from images.
  • Raw data: Images that are scaled segments from five-digit ZIP codes.
  • \(16\times16\) eight-bit grayscale maps
  • Pixel intensities range from 0 (black) to 255 (white)
  • Input data: represent each image as a high-dimensional vector \(x \in \mathbb{R}^{256}\).

PCA can transform the high-dimensional image data into a much smaller number of principal components.
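As a minimal sketch of this idea, the code below flattens \(16\times16\) images into vectors in \(\mathbb{R}^{256}\) and projects them onto the first few principal components. The images are random stand-ins rather than the actual ZIP code data, PCA is computed through the SVD of the centered data matrix, and the choice of k = 10 components is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the digit data: 200 random 16x16 eight-bit images.
images = rng.integers(0, 256, size=(200, 16, 16)).astype(float)

# Flatten each image into a vector x in R^256.
X = images.reshape(len(images), -1)

# Center the data, then take the SVD; the rows of Vt are the principal directions.
mean = X.mean(axis=0)
Xc = X - mean
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 10  # number of principal components to keep (illustrative choice)
scores = Xc @ Vt[:k].T  # each image is now described by k numbers

# Fraction of the total variance carried by each component.
explained = s**2 / np.sum(s**2)

print(X.shape, "->", scores.shape)
print("variance explained by the first k PCs:", explained[:k].sum())
```

With real digit images, which are far more structured than random noise, the first few components would be expected to capture a much larger share of the variance.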

Example 2: Face Recognition

Consider the cumulative effect of nine principal components, added one PC at a time, for a face image with a "sad" expression. The more principal components we use, the better the resolution of the reconstructed image. However, 4 or 5 principal components are already enough to judge the expression as sad. This is a dramatic dimension reduction, considering that the original number of variables is the number of pixels in the image.
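The cumulative effect can be sketched numerically: reconstruct one image from its first k principal components and watch the reconstruction error fall as k grows. The data below are synthetic (256-pixel "images" built from five hidden patterns plus noise), not actual face images, so the numbers only illustrate the mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 100 "images" of 256 pixels built from 5 hidden patterns plus noise.
n, p, r = 100, 256, 5
patterns = rng.normal(size=(r, p))
weights = rng.normal(size=(n, r))
X = weights @ patterns + 0.1 * rng.normal(size=(n, p))

# Principal directions from the SVD of the centered data.
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

x = Xc[0]  # one centered image to reconstruct
errors = []
for k in range(1, 10):
    V_k = Vt[:k]                 # first k principal directions
    x_hat = (x @ V_k.T) @ V_k    # projection of x onto their span
    errors.append(np.linalg.norm(x - x_hat) / np.linalg.norm(x))

for k, e in enumerate(errors, start=1):
    print(f"{k} PCs: relative reconstruction error {e:.3f}")
```

Because each added component enlarges the subspace being projected onto, the error can never increase as k grows.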

Why do dimensionality reduction?

  • Computational: compress data \(\Rightarrow\) time/space efficiency.
  • Statistical: fewer dimensions \(\Rightarrow\) better generalization.
  • Visualization: understand the structure of data.
  • Anomaly detection: describe normal data, detect outliers.
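As one hedged illustration of the anomaly detection point, the sketch below fits a low-dimensional PCA subspace to "normal" data and flags a new point whose distance from that subspace is unusually large. The data, the subspace dimension of 2, and the threshold rule (the largest residual seen among the normal points) are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical "normal" data: 300 points in R^20 lying close to a 2-D subspace.
basis = rng.normal(size=(2, 20))
normal = rng.normal(size=(300, 2)) @ basis + 0.05 * rng.normal(size=(300, 20))

# Describe the normal data by its mean and first two principal directions.
mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
V = Vt[:2]

def residual(x):
    """Distance from x to the fitted 2-D principal subspace."""
    xc = x - mean
    return np.linalg.norm(xc - (xc @ V.T) @ V)

# Illustrative threshold: the largest residual among the normal points.
threshold = max(residual(x) for x in normal)

# A point that does not follow the low-dimensional structure.
outlier = mean + 3.0 * rng.normal(size=20)
is_outlier = residual(outlier) > threshold
print("outlier flagged:", is_outlier)
```

The same fitted subspace also illustrates compression: each point can be stored as 2 scores instead of 20 coordinates, plus the shared mean and directions.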

When faced with high-dimensional data, it is natural to consider projecting the data onto a lower-dimensional subspace without losing important information. Common ways to handle high dimensionality include:

  • Variable selection, also called feature selection.
  • Shrinkage: Ridge regression and Lasso.
  • Creating a reduced set of linear or nonlinear transformations of the input variables, also called feature extraction, e.g., PCA.
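The three approaches above can be contrasted in a small sketch on synthetic regression data. Several choices here are assumptions made for illustration: variables are ranked by absolute correlation with the response, ridge regression is solved in closed form (the Lasso needs an iterative solver and is omitted), and two variables or components are kept.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: 8 predictors, only the first two affect the response.
n, p = 50, 8
X = rng.normal(size=(n, p))
beta_true = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.5 * rng.normal(size=n)
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# 1) Variable selection: keep a subset of the ORIGINAL columns.
#    Ranking by absolute correlation with y is an illustrative criterion.
corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
selected = np.sort(np.argsort(corr)[-2:])
X_sel = X[:, selected]

# 2) Shrinkage: ridge keeps all p variables but shrinks their coefficients.
def ridge(lam):
    """Closed-form ridge solution on centered data."""
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

norms = [np.linalg.norm(ridge(lam)) for lam in (0.0, 10.0, 100.0)]

# 3) Feature extraction: PCA builds NEW variables as linear combinations.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T

print("selected columns:", selected)
print("ridge coefficient norms for increasing penalty:", norms)
print("extracted feature matrix shape:", Z.shape)
```

Selection and extraction both end with two columns, but the selected columns are original variables, while the extracted ones mix all eight predictors. Ridge keeps every predictor and reduces the size of the coefficient vector as the penalty grows.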

Please finish the quiz for this lesson and the team project on Canvas (check the course schedule for due dates).