Example 10-6: Swiss Bank notes
Recall that we have two populations of notes, genuine and counterfeit, and that six measurements were taken on each note:
 Length
 Right-Hand Width
 Left-Hand Width
 Top Margin
 Bottom Margin
 Diagonal
Priors
In this case, it would not be reasonable to assume equal priors for the two types of banknotes. Equal priors would imply that half the banknotes in circulation are counterfeit and half are genuine. That is a very high counterfeit rate, and if it were that bad the Swiss government would probably be bankrupt! We need to consider unequal priors in which the vast majority of banknotes are thought to be genuine. For this example, let us assume that no more than 1% of banknotes in circulation are counterfeit and 99% of the notes are genuine. The prior probabilities can then be expressed as:
\(\hat{p}_1 = 0.99\) and \(\hat{p}_2 = 0.01\)
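To see how much the choice of priors matters, the following minimal Python sketch applies Bayes' rule to a single note under equal and unequal priors. The likelihood values are hypothetical numbers chosen for illustration; they are not computed from the Swiss bank note data.

```python
def posterior_counterfeit(prior_genuine, prior_counterfeit, lik_genuine, lik_counterfeit):
    """Posterior probability that a note is counterfeit, by Bayes' rule."""
    numerator = prior_counterfeit * lik_counterfeit
    denominator = prior_genuine * lik_genuine + numerator
    return numerator / denominator

# Hypothetical likelihoods: the measurements are 10 times more likely
# under the counterfeit population than under the genuine one.
lik_genuine, lik_counterfeit = 0.02, 0.20

equal = posterior_counterfeit(0.5, 0.5, lik_genuine, lik_counterfeit)
unequal = posterior_counterfeit(0.99, 0.01, lik_genuine, lik_counterfeit)

print(f"equal priors:   {equal:.4f}")
print(f"unequal priors: {unequal:.4f}")
```

With these made-up likelihoods, equal priors give a posterior of about 0.91 that the note is counterfeit, while the 99%/1% priors give about 0.09 for the same evidence, so the note would be classified differently.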
The first step in the analysis is to carry out Bartlett's test to check for homogeneity of the variance-covariance matrices.
Download the text file with the data here: swiss1.txt
Using SAS
To do this we will use the SAS program shown below:
Download the SAS program here: swiss9.sas
SAS Notes
SAS can make the choice between a linear and a quadratic discriminant analysis for you. Let's look at the proc discrim procedure in the SAS program that we just used.
By including pool=test, SAS will decide what kind of discriminant analysis to carry out based on the results of Bartlett's test.
If the test fails to reject, then SAS will automatically do a linear discriminant analysis. If the test rejects, then SAS will do a quadratic discriminant analysis.
There are two other options here. If we put pool=yes, then SAS will conduct a linear discriminant analysis whether it is warranted or not. It will pool the variance-covariance matrices and do a linear discriminant analysis without reporting Bartlett's test.
If we put pool=no, then SAS will not pool the variance-covariance matrices and will perform a quadratic discriminant analysis.
SAS does not actually print out the quadratic discriminant function, but it will use quadratic discriminant analysis to classify sample units into populations.
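The logic of the three pool= settings can be summarized in a short Python sketch. This is only an illustration of the decision rule described above, not SAS code. The function name and the p-value passed to it are hypothetical, and the 0.10 significance level is an assumption (in SAS this level is believed to be controlled by the SLPOOL= option of proc discrim; check the SAS documentation for the value in effect).

```python
def choose_discriminant_analysis(pool, bartlett_p_value=None, alpha=0.10):
    """Return 'linear' or 'quadratic' following the pool= logic described above.

    alpha is an assumed significance level for the homogeneity test.
    """
    if pool == "yes":
        # Always pool the variance-covariance matrices.
        return "linear"
    if pool == "no":
        # Never pool the variance-covariance matrices.
        return "quadratic"
    if pool == "test":
        if bartlett_p_value is None:
            raise ValueError("pool='test' needs the p-value from Bartlett's test")
        # Rejecting homogeneity leads to the quadratic rule.
        return "quadratic" if bartlett_p_value < alpha else "linear"
    raise ValueError(f"unknown pool option: {pool!r}")

# Hypothetical p-values, for illustration only
print(choose_discriminant_analysis("test", bartlett_p_value=0.00005))
print(choose_discriminant_analysis("test", bartlett_p_value=0.45))
```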
Using Minitab
View the video below to see how discriminant analysis is performed using the Minitab statistical software application.
Bartlett's test finds a significant difference between the variance-covariance matrices of the genuine and counterfeit bank notes \(\left( L' = 121.90;\ \mathrm{d.f.} = 21;\ p < 0.0001 \right)\). The variance-covariance matrix for the genuine notes is not equal to the variance-covariance matrix for the counterfeit notes. Because we reject the null hypothesis of equal variance-covariance matrices, a linear discriminant analysis is not appropriate for these data. A quadratic discriminant analysis is necessary.
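The 21 degrees of freedom can be checked by hand. Assuming the usual degrees-of-freedom formula for this test, and writing \(k\) for the number of variables and \(g\) for the number of populations (symbols introduced here for illustration; there are \(k = 6\) measurements and \(g = 2\) types of notes):

\[ \mathrm{d.f.} = \frac{k(k+1)(g-1)}{2} = \frac{6 \times 7 \times 1}{2} = 21 \]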
Example 10-7: Swiss Bank notes
Let us consider a bank note with the following measurements:
| Variable | Measurement |
|---|---|
| Length | 214.9 |
| Left Width | 130.1 |
| Right Width | 129.9 |
| Bottom Margin | 9.0 |
| Top Margin | 10.6 |
| Diagonal | 140.5 |

Any number of sets of measurements may be considered; here we are interested in just one. We want to classify this bank note as genuine or counterfeit. The posterior probability that it is counterfeit is only 0.000002526, so the posterior probability that it is genuine is very close to one (1 - 0.000002526 = 0.999997474). We are nearly 100% confident that this is a genuine note and not a counterfeit.
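For reference, a posterior probability of this kind is typically obtained from quadratic discriminant scores. The expressions below are the standard textbook form, given as a sketch; the notation (\(\bar{\mathbf{x}}_i\) and \(\mathbf{S}_i\) for the sample mean vector and variance-covariance matrix of population \(i\), \(\hat{p}_i\) for its prior) is assumed here rather than taken from the SAS output:

\[ s_i^Q(\mathbf{x}) = -\tfrac{1}{2}\log|\mathbf{S}_i| - \tfrac{1}{2}(\mathbf{x}-\bar{\mathbf{x}}_i)'\,\mathbf{S}_i^{-1}(\mathbf{x}-\bar{\mathbf{x}}_i) + \log \hat{p}_i \]

\[ \hat{p}(\pi_i \mid \mathbf{x}) = \frac{\exp\{s_i^Q(\mathbf{x})\}}{\exp\{s_1^Q(\mathbf{x})\} + \exp\{s_2^Q(\mathbf{x})\}} \]

The two posterior probabilities sum to one, which is why the probability of being genuine is found by subtracting the probability of being counterfeit from 1.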
Next, consider the results of cross-validation.
The resulting confusion table is as follows:
| Truth | Classified As Counterfeit | Classified As Genuine | Total |
|---|---|---|---|
| Counterfeit | 98 | 2 | 100 |
| Genuine | 1 | 99 | 100 |
| Total | 99 | 101 | 200 |

Here, we can see that 98 out of 100 counterfeit notes are expected to be correctly classified, while 99 out of 100 genuine notes are expected to be correctly classified. Thus, the estimated misclassification probabilities are:
\(\hat{p}(\text{real} \mid \text{fake}) = 0.02 \) and \(\hat{p}(\text{fake} \mid \text{real}) = 0.01 \)
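These rates follow directly from the rows of the confusion table. As a minimal Python sketch, the counts below are copied from the cross-validation table. The prior-weighted overall error rate at the end is an extra summary added here for illustration, using the 99%/1% priors assumed earlier; it is not part of the SAS output discussed above.

```python
# Cross-validation counts: outer key is the truth, inner key is the classification
confusion = {
    "counterfeit": {"counterfeit": 98, "genuine": 2},
    "genuine": {"counterfeit": 1, "genuine": 99},
}
priors = {"genuine": 0.99, "counterfeit": 0.01}

def misclassification_rate(truth):
    """Share of notes of a given true type that were assigned to the other type."""
    row = confusion[truth]
    wrong = sum(count for label, count in row.items() if label != truth)
    return wrong / sum(row.values())

rates = {truth: misclassification_rate(truth) for truth in confusion}

# Overall error rate, weighting each conditional rate by its prior
overall = sum(priors[truth] * rates[truth] for truth in confusion)

print(rates)
print(f"prior-weighted error rate: {overall:.4f}")
```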
The question remains: Are these acceptable misclassification rates?
A decision should be made in advance as to what would be the acceptable levels of error. Here again, you need to think about the consequences of making a mistake. In terms of classifying a genuine note as a counterfeit, one might put an innocent person in jail. If you make the opposite error you might let a criminal go free. What are the costs of these types of errors? And, are the above error rates acceptable? This decision should be made in advance. You should have some prior notion of what you would consider reasonable.
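One way to make the cost question concrete is to attach a cost to each type of error and compute the expected cost per note. The sketch below assumes the 99%/1% priors and the cross-validation error rates given above; the cost figures are entirely hypothetical and would have to be chosen in advance by whoever bears the consequences.

```python
def expected_cost(priors, error_rates, costs):
    """Expected misclassification cost per note: sum of prior * error rate * cost."""
    return sum(priors[t] * error_rates[t] * costs[t] for t in priors)

priors = {"genuine": 0.99, "counterfeit": 0.01}
# Probability of misclassifying a note of each true type (from cross-validation)
error_rates = {"genuine": 0.01, "counterfeit": 0.02}
# Hypothetical costs of misclassifying a note of each true type
costs = {"genuine": 50.0, "counterfeit": 10.0}

cost = expected_cost(priors, error_rates, costs)
print(f"expected cost per note: {cost:.3f}")
```

Because genuine notes are so much more common, even a small rate of calling genuine notes counterfeit dominates the expected cost under these assumed figures.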