R Code for Tree-Based Algorithms

# Load the required packages
library(tree)
library(randomForest)

# Ensure the response is a factor so that tree() grows a classification tree
TXTrain1$TxClassLib <- as.factor(TXTrain1$TxClassLib)

# tree() grows a classification tree automatically for a factor response;
# method="class" is an rpart argument and is not used by tree()
TXTrainTree1 <- tree(TxClassLib ~ ., data=TXTrain1)

plot(TXTrainTree1, col="dark red")

text(TXTrainTree1, pretty=0, cex=0.6, col="dark red")

mtext("Decision Tree (Unpruned) for Training Set 1", side=3, line = 2, cex=0.8, col="dark red")

# misclass.tree() returns the number of misclassified training observations
m <- misclass.tree(TXTrainTree1)

propmisTrain1 <- m / length(TXTrainTree1$y)

cat("Proportion of Misclassification in Training Set 1:", propmisTrain1, "\n")

TXTest1Treefit1 <- predict(TXTrainTree1, TXTest1, type="class")

Tab1 <- table(TXTest1Treefit1, TXTest1$TxClassLib)

# sum(diag(Tab1)) counts the correctly classified observations; tr() is not a base R function
propmisTest1 <- 1 - sum(diag(Tab1)) / length(TXTest1Treefit1)

cat("Proportion of Misclassification in Test Set 1 =", propmisTest1, "\n")

TXTrainPruneTree1 <- prune.misclass(TXTrainTree1, best=20)

m <- misclass.tree(TXTrainPruneTree1)

# Training misclassification proportion for the pruned tree
m / length(TXTrainPruneTree1$y)

plot(TXTrainPruneTree1, col="dark red")

text(TXTrainPruneTree1, pretty=0, cex=0.6, col="dark red")

mtext("Decision Tree for Training Set 1", side=3, line = 2, cex=0.8, col="dark red")

TXTest1PruneTreefit1 <- predict(TXTrainPruneTree1, TXTest1, type="class")

Tab1 <- table(TXTest1PruneTreefit1, TXTest1$TxClassLib)

# As above, use sum(diag(Tab1)) for the count of correct classifications
propmisTest1 <- 1 - sum(diag(Tab1)) / length(TXTest1PruneTreefit1)

cat("Proportion of Misclassification in Test Set 1 =", propmisTest1, "\n")

################### Random Forest ###################

# Columns 1:8 hold the predictors and column 9 the class label; a factor y
# makes randomForest() fit a classification forest
TXTrainRF1 <- randomForest(TXTrain1[,1:8], TXTrain1[,9], ntree=100, importance=TRUE, proximity=TRUE)

# TXTrainRF1 <- randomForest(TXTrain1[,1:8],TXTrain1[,9], xtest=TXTest1[,1:8], ytest=TXTest1[,9], ntree=100, importance=T, proximity=T)

plot(TXTrainRF1, main="OOB Error Rate: Set 1", cex=0.4)

TXTrainRF1

varImpPlot(TXTrainRF1, pch=19, col="dark red", main="Variable Importance: Set 1", cex=0.8)

The classification tree algorithm was applied to all the training sets, and the misclassification probability was calculated for both the training and test sets. All the training sets give rise to very similar decision trees; three representative trees are shown below as examples.

The following table summarizes the misclassification probabilities for the tree classification.

Therefore the overall misclassification probability of the 10-fold cross-validation is 17.9%, which is the mean misclassification probability over the test sets.
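The overall figure is simply the mean of the ten per-fold test rates. A minimal sketch, using illustrative placeholder values in place of the actual fold results reported in the table:

```r
# Placeholder per-fold test misclassification probabilities; the real ten
# values come from the fold-by-fold runs summarized above
propmisTest <- c(0.18, 0.17, 0.19, 0.18, 0.17, 0.18, 0.19, 0.18, 0.17, 0.18)

# Overall 10-fold cross-validation misclassification probability
cvError <- mean(propmisTest)
cat("Overall CV misclassification probability:", cvError, "\n")
```

With these placeholder values the mean is 0.179, i.e. the 17.9% quoted above.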

Pruning was tried for this decision tree, but it did not improve the result.
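For reference, a standard way to choose the pruning size, not necessarily the procedure used above, is cross-validation with cv.tree() from the tree package. A self-contained sketch on the built-in iris data, standing in for TXTrain1:

```r
library(tree)

# Fit an unpruned classification tree on the iris data
fit <- tree(Species ~ ., data = iris)

# 10-fold cross-validation over the sequence of pruned subtrees,
# scored by misclassification count
set.seed(1)
cvfit <- cv.tree(fit, FUN = prune.misclass)

# Pick the smallest tree size attaining the minimum CV error
best <- min(cvfit$size[cvfit$dev == min(cvfit$dev)])

pruned <- prune.misclass(fit, best = best)
```

If the cross-validated error is flat across sizes, as appears to be the case here, pruning will not improve the test error.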

However, it should be noted that LDA takes into account linear combinations of the predictors, whereas a tree always divides the sample space with splits parallel to the axes. If the separation lies along any other direction, a tree will not be able to capture it. This is exactly what is happening here.
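This limitation is easy to demonstrate on simulated data. The sketch below (hypothetical data, not the TX sets) puts the class boundary on the diagonal x2 = x1, which LDA can represent with a single linear boundary but a tree can only approximate with a staircase of axis-parallel splits:

```r
library(tree)
library(MASS)  # for lda()

# Two classes separated by the oblique line x2 = x1
set.seed(1)
n   <- 400
x1  <- runif(n)
x2  <- runif(n)
cls <- factor(ifelse(x2 > x1, "A", "B"))
d   <- data.frame(x1, x2, cls)

treeFit <- tree(cls ~ x1 + x2, data = d)
ldaFit  <- lda(cls ~ x1 + x2, data = d)

# Training misclassification rates for the two methods
treeErr <- mean(predict(treeFit, d, type = "class") != d$cls)
ldaErr  <- mean(predict(ldaFit, d)$class != d$cls)
cat("Tree error:", treeErr, " LDA error:", ldaErr, "\n")
```

On this separable oblique problem LDA typically attains a near-zero error, while the tree, restricted to axis-parallel splits, is left with a noticeably higher one.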