11.8.4 - Related Methods for Decision Trees


Random Forests

Leo Breiman developed an extension of decision trees called random forests. There is publicly available software for this method. Here is a good place to start:

https://en.wikipedia.org/wiki/Random_forest

Random forests train multiple trees. When growing each tree, only a randomly chosen subset of the features is considered for thresholding at each node split. Leo Breiman ran extensive experiments comparing random forests with support vector machines and found that, overall, random forests seemed to be slightly better. Random forests also come with several other practical advantages, such as out-of-bag error estimates and variable importance measures.
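The key mechanism described above, restricting each node split to a random subset of features, can be sketched in a few lines of Python. This is a simplified illustration, not Breiman's full algorithm: the function name, the Gini impurity criterion, and the common choice of a subset of size sqrt(d) are assumptions for the sketch.

```python
import math
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((k / n) ** 2 for k in counts.values())

def best_random_split(X, y, rng=random):
    """Find the best (feature, threshold) pair at a node, but search
    only a random subset of sqrt(d) features. Restricting the search
    this way is what decorrelates the trees in a random forest."""
    n, d = len(X), len(X[0])
    m = max(1, int(math.sqrt(d)))          # size of the random feature subset (a common default)
    candidates = rng.sample(range(d), m)   # features considered at this node
    best, best_score = None, float("inf")
    for j in candidates:
        for t in sorted(set(row[j] for row in X)):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            # weighted average impurity of the two children
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_score, best = score, (j, t)
    return best, best_score
```

For example, with X = [[0], [1], [2], [3]] and y = [0, 0, 1, 1] the best split is on feature 0 at threshold 1, which separates the classes perfectly. A full random forest would grow many trees this way, each on a bootstrap sample of the data, and average (or vote over) their predictions.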

Other extensions

The candidate questions in decision trees ask whether a variable is greater or smaller than a given value. There are extensions to more complicated questions, or splitting methods, for instance performing LDA at every node, but the original decision tree method seems to remain the most popular, and there is no strong evidence that the extensions are considerably better in general.