T.1.2 - Resistant Regression Methods

The next method we discuss is often used interchangeably with robust regression methods. However, there is a subtle difference between the two that is not usually outlined in the literature. Whereas robust regression methods attempt only to dampen the influence of outlying cases, resistant regression methods use estimates that are not influenced by any outliers (this comes from the definition of resistant statistics, which are measures of the data that are not influenced by outliers, such as the median). This is best accomplished by trimming the data, i.e., removing extreme values from one or both ends of the range of data values.

There is also one other relevant term when discussing resistant regression methods. Suppose we have a data set \(x_{1},x_{2},\ldots,x_{n}\). The order statistics are simply defined to be the data values arranged in increasing order and are written as \(x_{(1)},x_{(2)},\ldots,x_{(n)}\). Therefore, the minimum and maximum of this data set are \(x_{(1)}\) and \(x_{(n)}\), respectively. As we will see, the resistant regression estimators provided here are all based on the ordered residuals.
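As a quick numerical illustration of this notation (a minimal sketch using NumPy; the data values below are made up purely for illustration):

```python
import numpy as np

# Hypothetical data values (illustrative only)
x = np.array([5.1, 2.3, 9.8, 4.4, 7.0])

# Order statistics: the same values arranged in increasing order
x_ordered = np.sort(x)   # x_(1), x_(2), ..., x_(n)

print(x_ordered)         # [2.3 4.4 5.1 7.  9.8]
print(x_ordered[0])      # minimum, x_(1)
print(x_ordered[-1])     # maximum, x_(n)
```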

We present three commonly used resistant regression methods:

Least Quantile of Squares

The least quantile of squares method minimizes the \(\nu^{\textrm{th}}\) ordered squared residual (the quantile is presumably chosen to be representative of where the bulk of the data is expected to lie) and is formally defined by \(\begin{equation*} \hat{\beta}_{\textrm{LQS}}=\arg\min_{\beta}\epsilon_{(\nu)}^{2}(\beta), \end{equation*}\) where \(\nu=Pn\) corresponds to the \(P^{\textrm{th}}\) quantile (i.e., \(0<P\leq 1\)) of the empirical data (if \(\nu\) is not an integer, then round it up or down to an integer value). The special case of the least quantile of squares method with \(P=0.5\) (i.e., the median) is called the least median of squares method (and the estimate is often written as \(\hat{\beta}_{\textrm{LMS}}\)).
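To make the definition concrete, here is a minimal sketch of a least median of squares fit (the \(P=0.5\) case) for a simple linear regression. It minimizes the median squared residual with a general-purpose optimizer started from the ordinary least squares fit; the simulated data, starting values, and optimizer choice are all illustrative assumptions, and production implementations (e.g., lqs in R's MASS package) use specialized resampling algorithms instead.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated data with a few gross outliers (illustrative only)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)
y[:5] += 30                        # contaminate five responses

X = np.column_stack([np.ones(n), x])

def lqs_objective(beta, P=0.5):
    """The nu-th smallest squared residual, nu = ceil(P * n)."""
    resid_sq = np.sort((y - X @ beta) ** 2)
    nu = int(np.ceil(P * n))
    return resid_sq[nu - 1]        # order statistic epsilon_(nu)^2

# Start from the OLS fit and minimize the (nonsmooth) objective with
# Nelder-Mead; multiple restarts would help avoid local minima in practice.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lms = minimize(lqs_objective, beta_ols, method="Nelder-Mead").x

print("OLS:", beta_ols)
print("LMS:", beta_lms)
```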

Least Trimmed Sum of Squares

The least trimmed sum of squares method minimizes the sum of the \(h\) smallest squared residuals and is formally defined by \(\begin{equation*} \hat{\beta}_{\textrm{LTS}}=\arg\min_{\beta}\sum_{i=1}^{h}\epsilon_{(i)}^{2}(\beta), \end{equation*}\) where \(h\leq n\). If \(h=n\), then you just obtain \(\hat{\beta}_{\textrm{OLS}}\).
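A corresponding sketch of the least trimmed sum of squares objective, again using a general-purpose optimizer purely for illustration (the choice \(h\approx 0.75n\), the simulated data, and the optimizer are assumptions; dedicated algorithms such as FAST-LTS instead use subsampling with concentration steps):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Same kind of contaminated toy data as in the previous sketch (illustrative only)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)
y[:5] += 30

X = np.column_stack([np.ones(n), x])
h = int(0.75 * n)                 # trim the 25% largest squared residuals

def lts_objective(beta):
    """Sum of the h smallest squared residuals."""
    resid_sq = np.sort((y - X @ beta) ** 2)
    return resid_sq[:h].sum()

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lts = minimize(lts_objective, beta_ols, method="Nelder-Mead").x
print("LTS:", beta_lts)           # with h = n this reduces to the OLS fit
```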

Least Trimmed Sum of Absolute Deviations

The least trimmed sum of absolute deviations method minimizes the sum of the \(h\) smallest absolute residuals and is formally defined by \(\begin{equation*} \hat{\beta}_{\textrm{LTA}}=\arg\min_{\beta}\sum_{i=1}^{h}|\epsilon(\beta)|_{(i)}, \end{equation*}\) where again \(h\leq n\). If \(h=n\), then you just obtain the least absolute deviations estimate \(\hat{\beta}_{\textrm{LAD}}\).
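The same sketch carries over with absolute rather than squared residuals (again, the data, the value of \(h\), and the optimizer are illustrative assumptions, not a recommended implementation):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Contaminated toy data, as in the earlier sketches (illustrative only)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)
y[:5] += 30

X = np.column_stack([np.ones(n), x])
h = int(0.75 * n)

def lta_objective(beta):
    """Sum of the h smallest absolute residuals."""
    abs_resid = np.sort(np.abs(y - X @ beta))
    return abs_resid[:h].sum()

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lta = minimize(lta_objective, beta_ols, method="Nelder-Mead").x
print("LTA:", beta_lta)           # with h = n this reduces to the LAD fit
```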

So, which robust or resistant regression method should we use? To guide you in the decision-making process, you will want to consider both the theoretical benefits of a certain method and the type of data you have. The theoretical aspects of these methods that are most often cited are their breakdown values and overall efficiency. The breakdown value is a measure of the proportion of contamination (due to outlying observations) that an estimation method can withstand and still remain robust against the outliers. Efficiency is a measure of an estimator's variance relative to another estimator (when its variance is the smallest it can possibly be, the estimator is said to be "best"). For example, the least quantile of squares method and the least trimmed sum of squares method both attain the same maximal breakdown value for certain choices of \(P\) and \(h\), the least median of squares method has low efficiency, and the least trimmed sum of squares method has the same efficiency (asymptotically) as certain M-estimators. As for your data, if there appear to be many outliers, then a method with a high breakdown value should be used. A preferred solution is to calculate many of these estimates for your data and compare their overall fits, but this will likely be computationally expensive.
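As a rough illustration of that last suggestion (a sketch only; the simulated data, the choices of \(P\) and \(h\), and the use of the median absolute residual as the comparison criterion are all assumptions), one might compute several of the above estimates on the same data and inspect a resistant summary of each fit's residuals:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy contaminated data set (illustrative only)
n = 50
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x + rng.normal(0, 1, n)
y[:5] += 30
X = np.column_stack([np.ones(n), x])
h = int(0.75 * n)
nu = int(np.ceil(0.5 * n))        # median order statistic for LMS

def fit(objective, start):
    """Minimize a resistant objective from a given starting value."""
    return minimize(objective, start, method="Nelder-Mead").x

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_lms = fit(lambda b: np.sort((y - X @ b) ** 2)[nu - 1], beta_ols)
beta_lts = fit(lambda b: np.sort((y - X @ b) ** 2)[:h].sum(), beta_ols)
beta_lta = fit(lambda b: np.sort(np.abs(y - X @ b))[:h].sum(), beta_ols)

# Compare fits with a resistant summary: the median absolute residual
for name, b in [("OLS", beta_ols), ("LMS", beta_lms),
                ("LTS", beta_lts), ("LTA", beta_lta)]:
    print(name, np.median(np.abs(y - X @ b)))
```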