10: Inference About Regression


Introduction

Let's get started! Here is what you will learn in this lesson.

Learning objectives for this lesson

Upon completion of this lesson, you should be able to do the following:

  • Understand the relationship between the slope of the regression line and correlation,
  • Comprehend the meaning of the Coefficient of Determination, R²,
  • Know how to determine which variable is the response and which is the explanatory variable in a regression equation,
  • Understand that correlation measures the strength of a linear relationship between two variables,
  • Realize how outliers can influence a regression equation, and
  • Interpret the test results of simple regression analyses.

Examining Relationships Between Two Variables

Previously we considered the distribution of a single quantitative variable. Now we will study the relationship between two quantitative variables. SPECIAL NOTE: For our purposes we will consider only the case where both variables are quantitative. However, a regression analysis can also use categorical variables (e.g. gender, class standing) as explanatory variables, i.e. predictors. When we consider the relationship between two variables, there are three possibilities:

  1. Both variables are categorical. We analyze an association through a comparison of conditional probabilities and summarize the data using contingency tables. Examples of categorical variables are gender and class standing.
  2. Both variables are quantitative. To analyze this situation we consider how one variable, called the response variable, changes in relation to changes in the other variable, called the explanatory variable. Graphically we use scatterplots to display two quantitative variables. Examples of quantitative variables are age, height, and weight (i.e. things that are measured).
  3. One variable is categorical and the other is quantitative, for instance height and gender. These are best compared by using side-by-side boxplots to display any differences or similarities in the center and variability of the quantitative variable (e.g. height) across the categories (e.g. Male and Female).
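
The three cases above can be sketched numerically. The following is a minimal illustration, not part of the original lesson: the `students` data set is made up, and the helper `pearson_r` is a hypothetical name introduced here. It computes conditional proportions for two categorical variables, the correlation for two quantitative variables, and a per-group median for the mixed case (the kind of center a side-by-side boxplot would display).

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy data (invented for illustration):
# (gender, class standing, height in inches, weight in pounds)
students = [
    ("Female", "Freshman", 64, 130),
    ("Male", "Freshman", 70, 165),
    ("Female", "Senior", 66, 140),
    ("Male", "Senior", 72, 180),
    ("Female", "Senior", 62, 120),
    ("Male", "Freshman", 68, 155),
]

# Case 1: both categorical. Build contingency-table counts, then the
# conditional proportion of each class standing given gender.
counts = Counter((g, c) for g, c, _, _ in students)
gender_totals = Counter(g for g, _, _, _ in students)
cond_prop = {(g, c): n / gender_totals[g] for (g, c), n in counts.items()}


# Case 2: both quantitative. Pearson correlation between height and weight.
def pearson_r(xs, ys):
    """Return the sample correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


heights = [h for _, _, h, _ in students]
weights = [w for _, _, _, w in students]
r = pearson_r(heights, weights)

# Case 3: one categorical, one quantitative. Compare the center of
# height across gender categories using the median of each group.
height_by_gender = {}
for g, _, h, _ in students:
    height_by_gender.setdefault(g, []).append(h)
medians = {g: median(hs) for g, hs in height_by_gender.items()}

print(cond_prop)
print(round(r, 3))
print(medians)
```

In practice a plotting library would be used to draw the scatterplot and boxplots; the sketch sticks to the standard library so that it runs anywhere.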