Lesson 4: SLR Model Assumptions

Overview

How do we evaluate a model? How do we know if the model we are using is good? One way to answer these questions is to assess whether the assumptions underlying the simple linear regression model seem reasonable when applied to the dataset in question. Because the assumptions concern the (population) prediction errors, which we cannot observe directly, we assess them by studying their (sample) estimates, the residuals.

In this lesson, we focus on graphical residual analysis. When we revisit this topic in the context of multiple linear regression in Lesson 7, we'll also study some statistical tests for assessing the assumptions. Throughout the rest of the course, particularly in Lesson 9, we'll consider various remedies for cases in which the linear regression model assumptions fail.

Objectives

Upon completion of this lesson, you should be able to:

Understand why we need to check the assumptions of our model.
Know the things that can go wrong with the linear regression model.
Know how we can detect various problems with the model using a residuals vs. fits plot.
Know how we can detect various problems with the model using residuals vs. predictor plots.
Know how we can detect a certain kind of dependence among the error terms using residuals vs. order plots.
Know how we can detect non-normal error terms using a normal probability plot.
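
As a rough preview of the plots named in these objectives, the sketch below fits a simple linear regression by least squares and builds the point pairs that each plot would display. It is only an illustration under stated assumptions: the data values are made up, the function names fit_slr and residual_diagnostics are hypothetical, and the normal scores use Blom's plotting positions, (i - 0.375) / (n + 0.25), which is one common convention and may differ from what a given statistics package uses.

```python
from statistics import NormalDist, mean


def fit_slr(x, y):
    """Return least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residual_diagnostics(x, y):
    """Build the (horizontal, vertical) point pairs for four residual plots.

    The data are assumed to be listed in the order they were collected,
    which is what makes the residuals vs. order pairs meaningful.
    """
    b0, b1 = fit_slr(x, y)
    fits = [b0 + b1 * xi for xi in x]              # fitted values
    resid = [yi - fi for yi, fi in zip(y, fits)]   # residuals: observed minus fitted
    n = len(resid)

    # Assumed convention: Blom's plotting positions for the normal scores.
    probs = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]
    normal_scores = [NormalDist().inv_cdf(p) for p in probs]

    return {
        "resid_vs_fits": list(zip(fits, resid)),
        "resid_vs_predictor": list(zip(x, resid)),
        "resid_vs_order": list(zip(range(1, n + 1), resid)),
        # Sorted residuals paired with the normal scores they would be
        # expected to match if the errors were normally distributed.
        "normal_probability": list(zip(normal_scores, sorted(resid))),
    }


# Made-up data, for illustration only (not one of the lesson's data sets).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
diag = residual_diagnostics(x, y)

for name, points in diag.items():
    print(name)
    for h, v in points:
        print(f"  {h:8.3f}  {v:8.3f}")
```

Each list of pairs can be passed to any plotting tool as a scatterplot. In simple linear regression the fitted values are a linear function of the predictor, so the residuals vs. fits and residuals vs. predictor pairs show the same pattern on a different horizontal scale.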

Lesson 4 Code Files

Below is a zip file that contains all the data sets used in this lesson:

STAT501_Lesson04.zip

adaptive.txt
alcoholarm.txt
alcoholtobacco.txt
alligator.txt
alphapluto.txt
anscombe.txt
bloodpress.txt
bluegills.txt
carstopping.txt
corrosion.txt
handheight.txt
incomebirth.txt
realestate_sales.txt
residuals.txt
skincancer.txt
solutions_conc.txt
treadmill.txt
treadwear.txt