Lesson 15: Tests Concerning Regression and Correlation

Overview

In Lessons 35 and 36, we learned how to calculate point and interval estimates of the intercept and slope parameters, \(\alpha\) and \(\beta\), of a simple linear regression model:

\(Y_i=\alpha+\beta(x_i-\bar{x})+\epsilon_i\)

with the random errors \(\epsilon_i\) following a normal distribution with mean 0 and variance \(\sigma^2\). In this lesson, we'll learn how to test the null hypothesis that the slope parameter equals some specified value, \(\beta_0\), say. Specifically, we'll learn how to test \(H_0:\beta=\beta_0\) using a \(t\)-statistic.
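To make the idea concrete before we get to the details, here is a minimal sketch of how such a \(t\)-statistic might be computed. It assumes the usual form of the statistic, namely the least squares slope estimate minus \(\beta_0\), divided by its estimated standard error \(\sqrt{MSE/\sum(x_i-\bar{x})^2}\), with \(n-2\) degrees of freedom. The function name `slope_t_statistic` and the small data set are made up for illustration; they are not part of the lesson.

```python
import math


def slope_t_statistic(x, y, beta0=0.0):
    """Return (b, t, df) for testing H0: beta = beta0.

    Uses the centered model Y_i = alpha + beta * (x_i - x_bar) + eps_i,
    so the least squares intercept estimate is simply y_bar.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sums of squares and cross products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b = sxy / sxx   # least squares estimate of the slope
    a = y_bar       # least squares estimate of alpha in the centered model

    # Mean squared error: residual sum of squares divided by n - 2.
    sse = sum((yi - a - b * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)

    # t-statistic: (estimate - hypothesized value) / estimated standard error.
    t = (b - beta0) / math.sqrt(mse / sxx)
    return b, t, n - 2


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b, t, df = slope_t_statistic(x, y, beta0=0.0)
print(f"slope estimate = {b:.3f}, t = {t:.2f}, df = {df}")
```

We would then compare the observed \(t\) to a \(t\) distribution with \(n-2\) degrees of freedom to decide whether to reject the null hypothesis.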

Now, perhaps it is not a point that has been emphasized yet, but if you take a look at the form of the simple linear regression model, you'll notice that the responses \(Y_i\) are denoted using a capital letter, while the predictors \(x_i\) are denoted using a lowercase letter. That's because, in the simple linear regression setting, we view the predictors as fixed values, whereas we view the responses as random variables whose possible values depend on the population \(x\) from which they came. Suppose instead that we had a situation in which we thought of the pairs \((X_i, Y_i)\), \(i=1, 2, \ldots, n\), as being a random sample from a bivariate normal distribution with parameters \(\mu_X\), \(\mu_Y\), \(\sigma^2_X\), \(\sigma^2_Y\) and \(\rho\). Then, we might be interested in testing the null hypothesis \(H_0:\rho=0\), because we know that, for a bivariate normal distribution, if the correlation coefficient is 0, then \(X\) and \(Y\) are independent random variables. For this reason, we'll learn, not one, but three (!) possible hypothesis tests for testing the null hypothesis that the correlation coefficient is 0. Then, because we haven't yet derived an interval estimate for the correlation coefficient, we'll also take the time to derive an approximate confidence interval for \(\rho\).
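As a preview, here is a rough sketch of two standard tools for the correlation coefficient. It is an assumption on our part that these match the versions developed later in the lesson: the statistic \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) for testing \(H_0:\rho=0\), and an approximate confidence interval for \(\rho\) based on Fisher's \(z\)-transformation, which treats \(\tanh^{-1}(r)\) as approximately normal with variance \(1/(n-3)\). The function names and the values of \(r\) and \(n\) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_correlation(x, y):
    """Sample correlation coefficient r of paired observations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t-statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def fisher_z_interval(r, n, confidence=0.95):
    """Approximate confidence interval for rho via Fisher's z-transformation."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1 / math.sqrt(n - 3)         # approximate standard error on the z scale
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Build the interval on the z scale, then transform back to the rho scale.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical values, for illustration only.
r, n = 0.5, 28
t = correlation_t_statistic(r, n)
low, high = fisher_z_interval(r, n)
print(f"t = {t:.3f} on {n - 2} df; approximate 95% CI for rho: ({low:.3f}, {high:.3f})")
```

Note that the interval is built on the transformed scale and then mapped back, so it is generally not symmetric about \(r\).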