Exploratory Analysis - 6

Backward Elimination & Stepwise Selection Procedures

We begin by using two subset selection procedures in SAS PROC LOGISTIC to choose variables related to the response:

  1. Backward elimination
  2. Stepwise selection

Which Model Should I Fit?

Take a look at this SAS program (water_level3.sas):

SAS program

The data are read in, the variables are identified, and then PROC LOGISTIC is called with a model in which Y (1 if the subject passed, 0 if the subject failed) is the response. Notice, highlighted in purple, the keywords 'backward' and 'stepwise', which specify the two subset selection procedures.
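For readers without the program image at hand, the two calls can be sketched roughly as follows. This is not a copy of water_level3.sas: the dataset name water and the macro variable candidates are assumptions made for illustration, and only the variable names y, sex, gravity and totphysics come from this page. The 0.05 thresholds are written out to match the p-value criterion seen in the output; the actual program may set them differently or rely on defaults.

```sas
/* Sketch only: the dataset name WATER and the macro variable CANDIDATES are
   hypothetical. Only sex, gravity and totphysics are named in the text;
   append the study's remaining candidate variables to this list. */
%let candidates = sex gravity totphysics;

/* Backward elimination: start with every candidate in the model and
   remove one variable per step while any p-value exceeds SLSTAY. */
proc logistic data=water;
  model y(event='1') = &candidates
        / selection=backward slstay=0.05;
run;

/* Stepwise selection: enter one variable per step (p-value below SLENTRY),
   and re-check after each step whether entered variables still meet SLSTAY. */
proc logistic data=water;
  model y(event='1') = &candidates
        / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

The event='1' option is included so that the model describes the probability of passing (Y = 1); without it, PROC LOGISTIC models the lower ordered response value.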

Backward Elimination

In the output, the procedure begins by entering all of the variables:

SAS output

and then one by one the variables are removed...

sas output

sas output

The model is refit after each removal. At the end of the procedure, the note below is reported, along with the four variables that were removed from the model.

sas output

Directly after this, the procedure lists the variables that are retained in the model, as their p-values are all < 0.05:

sas output

along with the coefficients that make up the fitted model.

Stepwise Selection

This procedure takes the opposite approach, beginning with one variable and then adding further variables to the model one at a time, refitting it each time.

sas output

sas output

until at the end of the procedure the following note is given:

sas output

and a summary list of the variables that remain in the model is displayed:

sas output

Odds Ratio Estimates

If we look at the Odds Ratio Estimates for both procedures:

Backward Elimination

sas output

Stepwise Selection

sas output

The two procedures each selected 6 variables, with 5 in common; backward elimination chose ‘gravity’ while stepwise chose ‘totphysics’. The odds ratio and confidence interval estimates are quite close for the 5 variables the two models have in common.
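As a reminder of how these tables relate to the fitted coefficients, the standard logistic regression relationship is sketched here. The notation (the subscript j for a predictor, the hats for estimates) is introduced for illustration and does not appear in the SAS output; the interval shown is the Wald form.

```latex
\widehat{\mathrm{OR}}_j = \exp\!\left(\hat{\beta}_j\right),
\qquad
95\%\ \text{Wald CI: }
\exp\!\left(\hat{\beta}_j \pm 1.96\,\widehat{\mathrm{SE}}\!\left(\hat{\beta}_j\right)\right)
```

Each odds ratio is the estimated multiplicative change in the odds of passing for a one-unit increase in that variable, holding the other variables in the model fixed.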

Furthermore, neither model includes the variable ‘sex’. We conclude that, after adjusting for these 6 independent variables, ‘sex’ does not have a significant effect on passing/failing.

This handout covers this information as well: WaterStudyModelSelection.pdf