Lesson 10: Discriminant Analysis

Overview

Discriminant analysis addresses a classification problem: two or more groups (clusters or populations) are known a priori, and one or more new observations are assigned to one of these known populations based on their measured characteristics. Let us look at three examples.

Example 10-1: Swiss Banknotes

We have two populations of banknotes: genuine and counterfeit. Six measurements are taken on each note:

  • Length
  • Right-Hand Width
  • Left-Hand Width
  • Top Margin
  • Bottom Margin
  • Diagonal across the printed area

The task is to take a banknote of unknown origin and determine, from these six measurements alone, whether it is genuine or counterfeit. This is not as impractical as it might sound: a more modern equivalent is a scanner that measures the notes automatically and makes the decision.

Example 10-2: Pottery Data

Pottery shards are sampled from four sites: L) Llanedyrn, C) Caldicot, I) Ilse Thornes, and A) Ashley Rails. The concentrations of the following chemical constituents were measured at a laboratory:

  • Al: Aluminum
  • Fe: Iron
  • Mg: Magnesium
  • Ca: Calcium
  • Na: Sodium

An archaeologist encounters a pottery specimen of unknown origin. To help determine possible trade routes, the archaeologist may wish to classify the specimen by its site of origin.

Example 10-3: Insect Data

Data were collected on two species of insects in the genus Chaetocnema: (a) Ch. concinna and (b) Ch. heikertlingeri. Three variables were measured on each insect:

  • width of the 1st joint of the tarsus (legs)
  • width of the 2nd joint of the tarsus
  • width of the aedeagus (reproductive organ)

Our objective is to obtain a classification rule for identifying the insect species from these three variables. An entomologist can distinguish these two closely related species, but the differences are so subtle that doing so takes considerable experience. If a classification rule can be developed, it might offer a more accurate way to differentiate between the two species.
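
To make the idea of a classification rule concrete, here is a minimal Python sketch of a two-group linear discriminant rule. It is an illustration under stated assumptions, not the analysis carried out in this lesson: the measurements are made-up numbers rather than the insect data, only two variables are used (so that the pooled covariance matrix can be inverted with the 2 x 2 formula), the two groups are assumed to share a common covariance matrix, and the prior probabilities are assumed to be equal. All function and variable names are hypothetical.

```python
import math

def mean_vector(rows):
    """Sample mean of a list of (x1, x2) observations."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def pooled_covariance(groups, means):
    """Pooled within-group covariance: summed scatter matrices / (N - g)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    total = 0
    for rows, m in zip(groups, means):
        total += len(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    s[a][b] += d[a] * d[b]
    dof = total - len(groups)
    return [[s[a][b] / dof for b in range(2)] for a in range(2)]

def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix (assumes a nonzero determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def linear_score(x, mean, s_inv, prior):
    """Linear score: mean' S^-1 x - 0.5 * mean' S^-1 mean + log(prior)."""
    w = [s_inv[0][0] * mean[0] + s_inv[0][1] * mean[1],
         s_inv[1][0] * mean[0] + s_inv[1][1] * mean[1]]
    return (w[0] * x[0] + w[1] * x[1]
            - 0.5 * (w[0] * mean[0] + w[1] * mean[1])
            + math.log(prior))

def classify(x, means, s_inv, priors):
    """Return the index of the group with the largest linear score."""
    scores = [linear_score(x, m, s_inv, p) for m, p in zip(means, priors)]
    return scores.index(max(scores))

# Made-up training data for two hypothetical groups (not the insect data).
group_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
group_b = [(4.0, 5.0), (4.5, 5.5), (5.0, 4.8), (4.2, 5.2)]

groups = [group_a, group_b]
means = [mean_vector(g) for g in groups]
s_inv = inverse_2x2(pooled_covariance(groups, means))
priors = [0.5, 0.5]  # assumption: equal prior probabilities

new_observation = (1.3, 2.0)
predicted_group = classify(new_observation, means, s_inv, priors)
print("Assigned to group index:", predicted_group)
```

A new observation is assigned to the group with the larger linear score. With more than two variables the same rule applies, with a general matrix inverse in place of the 2 x 2 formula.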

Objectives

Upon completion of this lesson, you should be able to:

  • Determine whether linear or quadratic discriminant analysis should be applied to a given data set;
  • Carry out both types of discriminant analysis using SAS or Minitab;
  • Apply the linear discriminant function to classify a subject by its measurements;
  • Understand how to assess the efficacy of discriminant analysis.
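
As a hedged illustration of the last objective, one common way to assess a discriminant rule is to estimate its misclassification rate by leave-one-out cross-validation: each observation is held out in turn, the rule is refit on the remaining data, and the held-out observation is classified. The Python sketch below assumes a single variable, two groups, equal priors, and a common variance, in which case the linear discriminant rule reduces to cutting at the midpoint of the two sample means. The data are made-up numbers and all names are hypothetical.

```python
def fit_midpoint_rule(xs_a, xs_b):
    """Fit the one-variable, equal-prior linear rule: cut at the midpoint of the means."""
    mean_a = sum(xs_a) / len(xs_a)
    mean_b = sum(xs_b) / len(xs_b)
    cut = (mean_a + mean_b) / 2
    a_below = mean_a < mean_b  # which side of the cut belongs to group A

    def classify(x):
        return "A" if (x < cut) == a_below else "B"

    return classify

def leave_one_out_confusion(xs_a, xs_b):
    """Count (true group, predicted group) pairs, refitting without each held-out point."""
    counts = {("A", "A"): 0, ("A", "B"): 0, ("B", "A"): 0, ("B", "B"): 0}
    for i, x in enumerate(xs_a):
        rule = fit_midpoint_rule(xs_a[:i] + xs_a[i + 1:], xs_b)
        counts[("A", rule(x))] += 1
    for i, x in enumerate(xs_b):
        rule = fit_midpoint_rule(xs_a, xs_b[:i] + xs_b[i + 1:])
        counts[("B", rule(x))] += 1
    return counts

def error_rate(counts):
    """Proportion of held-out observations assigned to the wrong group."""
    wrong = counts[("A", "B")] + counts[("B", "A")]
    return wrong / sum(counts.values())

# Made-up measurements for two hypothetical groups.
xs_a = [1.0, 1.4, 1.9, 2.3, 3.1]
xs_b = [2.8, 3.5, 3.9, 4.4, 5.0]

confusion = leave_one_out_confusion(xs_a, xs_b)
loo_error = error_rate(confusion)
print("Confusion counts (true, predicted):", confusion)
print("Leave-one-out error rate:", loo_error)
```

Holding each observation out before classifying it avoids the optimism of judging a rule on the same data used to build it.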