In this section, we present the test for the population correlation using a test statistic based on the sample correlation.
As with all hypothesis tests, there are underlying assumptions. The assumptions for the test for correlation are:
- There are no outliers in either of the two quantitative variables.
- The two variables should follow a normal distribution.
If there is no linear relationship in the population, then the population correlation would be equal to zero. The hypotheses for the two-sided test are therefore:
\(H_0\colon \rho=0\) (\(X\) and \(Y\) are linearly independent, or \(X\) and \(Y\) have no linear relationship)
\(H_a\colon \rho\ne0\) (\(X\) and \(Y\) are linearly dependent)
Research Question | Is there a linear relationship? | Is there a positive linear relationship? | Is there a negative linear relationship? |
---|---|---|---|
Null Hypothesis | \(\rho=0\) | \(\rho=0\) | \(\rho=0\) |
Alternative Hypothesis | \(\rho\ne0\) | \(\rho>0\) | \(\rho<0\) |
Type of Test | Two-tailed, non-directional | Right-tailed, directional | Left-tailed, directional |
Under the null hypothesis and with the above assumptions, the test statistic, \(t^*\), is found by:
\(t^*=\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\)
which follows a \(t\)-distribution with \(n-2\) degrees of freedom.
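The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Minitab workflow; the function name `correlation_t_statistic` and the input values \(r=0.5\), \(n=27\) are hypothetical and chosen only to show the arithmetic.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t_star, df) for testing H0: rho = 0 from a sample correlation r.

    Hypothetical helper for illustration; r is the sample correlation and
    n is the number of paired observations.
    """
    if n < 3:
        raise ValueError("need at least 3 observations so that df = n - 2 > 0")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    # t* = r * sqrt(n - 2) / sqrt(1 - r^2)
    t_star = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t_star, df

# Made-up inputs, used only to demonstrate the calculation.
t_star, df = correlation_t_statistic(0.5, 27)
print(f"t* = {t_star:.4f} with {df} degrees of freedom")
```

The guard on \(r\) avoids dividing by zero when the sample correlation is exactly \(\pm 1\), a case in which the statistic is undefined.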
As mentioned before, we will use Minitab for the calculations. The output from Minitab previously used to find the sample correlation also provides a p-value. This p-value is for the two-sided test. If the alternative is one-sided, the p-value from the output needs to be adjusted.
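One way to make the one-sided adjustment concrete: because the \(t\)-distribution is symmetric, the one-sided p-value is half the two-sided p-value when the sample correlation has the sign stated in the alternative, and one minus that half otherwise. The sketch below assumes this standard rule; the function name `one_sided_p_value` and the example inputs are hypothetical.

```python
def one_sided_p_value(two_sided_p, r, alternative):
    """Convert a two-sided p-value for H0: rho = 0 into a one-sided p-value.

    alternative: "greater" for Ha: rho > 0, "less" for Ha: rho < 0.
    Hypothetical helper; relies on the symmetry of the t-distribution.
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = two_sided_p / 2.0
    # Does the sample correlation point in the direction of the alternative?
    agrees = (r > 0) if alternative == "greater" else (r < 0)
    return half if agrees else 1.0 - half

# Made-up values: a two-sided p-value of 0.04 with a positive sample correlation.
print(one_sided_p_value(0.04, 0.6, "greater"))
print(one_sided_p_value(0.04, 0.6, "less"))
```

With these inputs the right-tailed test gives a small p-value and the left-tailed test gives a large one, since a positive sample correlation offers no support for \(\rho<0\).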
Example 9-7: Student height and weight (Tests for \(\rho\))
For the height and weight example (university_ht_wt.TXT), conduct a test for correlation with a significance level of 5%.
The output from Minitab is:
Correlation: height, weight
Correlations
Pearson correlation | 0.711 |
P-value | 0.000 |
For the sake of this example, we will find the test statistic and the p-value rather than just using the Minitab output. There are 28 observations.
The test statistic is:
\begin{align} t^*&=\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\\&=\dfrac{(0.711)\sqrt{28-2}}{\sqrt{1-0.711^2}}\\&=5.1556 \end{align}
Next, we need to find the p-value. The p-value for the two-sided test is:
\(\text{p-value}=2P(T>5.1556)<0.0001\)
Therefore, for any reasonable \(\alpha\) level, we can reject the hypothesis that the population correlation coefficient is 0 and conclude that it is nonzero. There is evidence at the 5% level that Height and Weight are linearly dependent.
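For readers without Minitab, the two-sided p-value can be approximated with the Python standard library alone. The sketch below integrates the \(t\) density numerically with Simpson's rule; this is a stand-in technique, not how Minitab computes the value, and the names `t_pdf` and `two_sided_p_value` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_star, df, steps=20000):
    """Approximate 2 * P(T > |t_star|) using composite Simpson's rule.

    steps must be even; the integral runs over [0, |t_star|].
    """
    upper = abs(t_star)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0  # approximates P(0 < T < |t_star|)
    return max(0.0, 2.0 * (0.5 - area))

# Values from the height and weight example: t* = 5.1556 with 28 - 2 = 26 df.
p = two_sided_p_value(5.1556, 26)
print(f"two-sided p-value: {p:.2e}")
```

The result falls below 0.0001, in agreement with the p-value reported above and with the 0.000 shown in the Minitab output.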
Try it!
For the sales and advertising example, conduct a test for correlation with a significance level of 5% with Minitab.
Sales units are in thousands of dollars, and advertising units are in hundreds of dollars.
Sales (Y) | Advertising (X) |
---|---|
1 | 1 |
1 | 2 |
2 | 3 |
2 | 4 |
4 | 5 |
Correlation: Y,X
Correlations
Pearson correlation | 0.904 |
P-value | 0.035 |
The sample correlation is 0.904. This value indicates a strong positive linear relationship between sales and advertising.
For the Sales (Y) and Advertising (X) data, the test statistic is:
\(t^*=\dfrac{(0.904)\sqrt{5-2}}{\sqrt{1-(0.904)^2}}=3.66\)
With \(n-2=3\) degrees of freedom, we arrive at a two-sided p-value of 0.035. Since this is less than \(\alpha=0.05\), we can reject the hypothesis that the population correlation coefficient is 0 and conclude that it is nonzero, i.e., conclude that sales and advertising are linearly dependent.
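The sample correlation and test statistic for this exercise can also be reproduced directly from the five data pairs in the table. The following is a minimal sketch using only the Python standard library; the helper name `pearson_r` is hypothetical.

```python
import math

# Data from the table: Sales (Y) and Advertising (X).
sales = [1, 1, 2, 2, 4]
advertising = [1, 2, 3, 4, 5]

def pearson_r(x, y):
    """Sample (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(advertising, sales)
n = len(sales)
t_star = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(round(r, 3), round(t_star, 2))  # 0.904 3.66
```

Using the unrounded correlation gives \(t^*\approx 3.656\) rather than the 3.662 obtained from the rounded value 0.904; both round to 3.66.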