7.3 - MLR Model Assumptions

The four conditions ("LINE") that comprise the multiple linear regression model generalize the simple linear regression model conditions to account for multiple predictors:

Linear Function: The mean of the response, \(\mbox{E}(Y_i)\), at each set of values of the predictors, \((x_{1i}, x_{2i},...)\), is a Linear function of the predictors.
Independent: The errors, \( \epsilon_{i}\), are Independent.
Normally Distributed: The errors, \( \epsilon_{i}\), at each set of values of the predictors, \((x_{1i}, x_{2i},...)\), are Normally distributed.
Equal variances: The errors, \( \epsilon_{i}\), at each set of values of the predictors, \((x_{1i}, x_{2i},...)\), have Equal variances (denoted \(\sigma^{2}\)).

An equivalent way to think of the first (linearity) condition is that the mean of the error, \(\epsilon_i\), at each set of values of the predictors, \((x_{1i},x_{2i},\dots)\), is zero. An alternative way to describe all four assumptions is that the errors, \(\epsilon_i\), are independent normal random variables with mean zero and constant variance, \(\sigma^2\).
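
The compact description above can be written as a single model statement. The coefficient symbols \(\beta_0, \beta_1, \dots, \beta_k\) and the number of predictors \(k\) are not introduced in this section; they are assumed here as the usual regression notation:

\[
Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_k x_{ki} + \epsilon_i,
\qquad \epsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2).
\]

Taking the mean of both sides at fixed predictor values gives \(\mbox{E}(Y_i) = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki}\) exactly when the mean of \(\epsilon_i\) is zero, which is why the linearity condition and the zero-mean-error condition are equivalent.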

As in simple linear regression, we can assess whether these conditions seem to hold for a multiple linear regression model applied to a particular sample dataset by looking at the estimated errors, i.e., the residuals, \(e_i = y_i-\hat{y}_i\).
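
As a minimal sketch of how the residuals might be computed and summarized, the Python example below fits a model with two predictors by least squares using NumPy. The simulated data, the coefficient values, and the helper name `fit_residuals` are all illustrative assumptions, not part of this lesson. The numerical summaries are rough checks only and do not replace residual plots.

```python
import numpy as np

def fit_residuals(X, y):
    """Least-squares fit with an intercept (hypothetical helper).

    Returns (coefficients, fitted values, residuals e_i = y_i - yhat_i).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    return coef, fitted, y - fitted

# Simulated data that satisfies the LINE conditions by construction
# (assumed values, chosen only for illustration).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n)

coef, fitted, resid = fit_residuals(X, y)

# Rough numerical summaries of the residuals.
low = fitted <= np.median(fitted)
print("mean residual:", resid.mean())
print("residual SD for low vs. high fitted values:",
      resid[low].std(ddof=1), resid[~low].std(ddof=1))
# Only meaningful when the row order reflects time or collection order.
print("lag-1 residual correlation:",
      np.corrcoef(resid[:-1], resid[1:])[0, 1])
```

With an intercept in the model, the residuals always average to zero, so the mean residual by itself says nothing about linearity. Patterns in the residuals against the fitted values or against each predictor are more informative, as are clearly different spreads across the range of fitted values.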