Lesson 4: SLR Model Assumptions

Overview

How do we evaluate a model? How do we know if the model we are using is good? One way to answer these questions is to assess whether the assumptions underlying the simple linear regression model seem reasonable when applied to the dataset in question. Because the assumptions concern the (population) prediction errors, which we cannot observe, we assess them by studying their (sample) estimates, the residuals.
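
In symbols, the distinction between the unobserved errors and the residuals can be sketched as follows. The notation is an assumption here (the usual textbook symbols), not something taken from the lesson text itself:

```latex
\begin{aligned}
y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i
  && \text{(population model; } \varepsilon_i \text{ is the unobserved error)} \\
\hat{y}_i &= b_0 + b_1 x_i
  && \text{(fitted value from the least-squares estimates } b_0, b_1\text{)} \\
e_i &= y_i - \hat{y}_i
  && \text{(residual, the sample estimate of } \varepsilon_i\text{)}
\end{aligned}
```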

We focus in this lesson on graphical residual analysis. When we revisit this topic in the context of multiple linear regression in Lesson 7, we'll also study some statistical tests for assessing the assumptions. Throughout the rest of the course, particularly in Lesson 9, we'll consider various remedies for cases in which the linear regression model assumptions fail.


Upon completion of this lesson, you should be able to:

  • Understand why we need to check the assumptions of our model.
  • Know the things that can go wrong with the linear regression model.
  • Know how we can detect various problems with the model using a residuals vs. fits plot.
  • Know how we can detect various problems with the model using residuals vs. predictor plots.
  • Know how we can detect a certain kind of dependent error terms using residuals vs. order plots.
  • Know how we can detect non-normal error terms using a normal probability plot.
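
The quantities behind the four plots named above can be computed with a few lines of code. The following is a minimal sketch, not the course's own software workflow: it uses synthetic data rather than any of the lesson's data sets, the function names `fit_slr` and `residual_diagnostics` are hypothetical, and the normal scores use Blom's plotting positions, which is only one common choice.

```python
import random
from statistics import NormalDist, mean


def fit_slr(x, y):
    """Return the least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residual_diagnostics(x, y):
    """Compute the values plotted in the four residual plots (hypothetical helper)."""
    b0, b1 = fit_slr(x, y)
    fits = [b0 + b1 * xi for xi in x]                  # fitted values
    residuals = [yi - fi for yi, fi in zip(y, fits)]   # observed minus fitted
    n = len(residuals)
    # Normal scores: standard normal quantiles at Blom's plotting positions
    # (i - 0.375) / (n + 0.25). Assumption: other position formulas are also used.
    normal_scores = [
        NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    return {
        "fits": fits,                          # x-axis of residuals vs. fits
        "predictor": list(x),                  # x-axis of residuals vs. predictor
        "order": list(range(1, n + 1)),        # x-axis of residuals vs. order
        "residuals": residuals,                # y-axis of the first three plots
        "sorted_residuals": sorted(residuals), # paired with normal_scores
        "normal_scores": normal_scores,
    }


# Synthetic example data (an assumption for illustration only).
rng = random.Random(0)
x = [float(i) for i in range(1, 21)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]
diag = residual_diagnostics(x, y)
```

Plotting `residuals` against `fits`, `predictor`, and `order`, and `sorted_residuals` against `normal_scores`, gives the four plots. The `order` values are meaningful only if the rows are stored in the order in which the data were collected.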

Lesson 4 Code Files

Below is a zip file that contains all the data sets used in this lesson:


  • adaptive.txt
  • alcoholarm.txt
  • alcoholtobacco.txt
  • alligator.txt
  • alphapluto.txt
  • anscombe.txt
  • bloodpress.txt
  • bluegills.txt
  • carstopping.txt
  • corrosion.txt
  • handheight.txt
  • incomebirth.txt
  • realestate_sales.txt
  • residuals.txt
  • skincancer.txt
  • solutions_conc.txt
  • treadmill.txt
  • treadwear.txt