15.2 - Three Tests for Rho


The hypothesis test for the slope \(\beta\) that we developed on the previous page assumed that the response \(Y\) is a linear function of a nonrandom predictor \(x\). This situation occurs when the researcher has complete control of the values of the variable \(x\). For example, a researcher might be interested in modeling the linear relationship between the temperature \(x\) of an oven and the moistness \(y\) of chocolate chip muffins. In this case, the researcher sets the oven temperatures (in degrees Fahrenheit) to 350, 360, 370, and so on, and then observes the values of the random variable \(Y\), that is, the moistness of the baked muffins. Under this setup, the linear model:

\(Y_i=\alpha+\beta x_i+\epsilon_i\)

implies that the average moistness:

\(E(Y)=\alpha+\beta x\)

is a linear function of the temperature setting.

There are other situations, however, in which the variable \(x\) is not nonrandom (yes, that's a double negative!), but rather an observed value of a random variable \(X\). For example, a fisheries researcher may want to relate the age \(Y\) of a sardine to its length \(X\). If a linear relationship could be established, then in the future fisheries researchers could predict the age of a sardine simply by measuring its length. In this case, the linear model:

\(Y_i=\alpha+\beta x_i+\epsilon_i\)

implies that the average age of a sardine, given its length is \(X=x\):

\(E(Y|X=x)=\alpha+\beta x\)

is a linear function of the length. That is, the conditional mean of \(Y\) given \(X=x\) is a linear function. Now, in this second situation, in which both \(X\) and \(Y\) are deemed random, we typically assume that the pairs \((X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)\) are a random sample from a bivariate normal distribution with means \(\mu_X\) and \(\mu_Y\), variances \(\sigma^2_X\) and \(\sigma^2_Y\), and correlation coefficient \(\rho\). If that's the case, it can be shown that the conditional mean:

\(E(Y|X=x)=\alpha+\beta x\)

must be of the form:

\(E(Y|X=x)=\left(\mu_Y-\rho \dfrac{\sigma_Y}{\sigma_X} \mu_X\right)+\left(\rho \dfrac{\sigma_Y}{\sigma_X}\right)x\)
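(This is just the standard bivariate normal result \(E(Y|X=x)=\mu_Y+\rho \dfrac{\sigma_Y}{\sigma_X}(x-\mu_X)\), with the terms regrouped into an intercept and a slope.)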

That is:

\(\beta=\rho \dfrac{\sigma_Y}{\sigma_X}\)

Now, for the case where \((X_i, Y_i)\) has a bivariate normal distribution, the researcher may not necessarily be interested in estimating the linear function:

\(E(Y|X=x)=\alpha+\beta x\)

but rather simply knowing whether \(X\) and \(Y\) are independent. In STAT 414, we've learned that if \((X_i, Y_i)\) follows a bivariate normal distribution, then testing for the independence of \(X\) and \(Y\) is equivalent to testing whether the correlation coefficient \(\rho\) equals 0. We'll now work on developing three different hypothesis tests for testing \(H_0:\rho=0\) assuming \((X_i, Y_i)\) follows a bivariate normal distribution.

A T-Test for Rho

Given our wordy prelude above, this test may be the simplest of all of the tests to develop. That's because we argued above that if \((X_i, Y_i)\) follows a bivariate normal distribution, and the conditional mean is a linear function:

\(E(Y|X=x)=\alpha+\beta x\)

then:

\(\beta=\rho \dfrac{\sigma_Y}{\sigma_X}\)

That suggests, therefore, that testing \(H_0:\rho=0\) against any of the alternative hypotheses \(H_A:\rho\neq 0\), \(H_A:\rho> 0\), and \(H_A:\rho< 0\) is equivalent to testing \(H_0:\beta=0\) against the corresponding alternative hypothesis \(H_A:\beta\neq 0\), \(H_A:\beta>0\), and \(H_A:\beta<0\), because \(\sigma_Y/\sigma_X>0\) forces \(\beta\) and \(\rho\) to have the same sign. That is, we can simply compare the test statistic:

\(t=\dfrac{\hat{\beta}-0}{\sqrt{MSE/\sum(x_i-\bar{x})^2}}\)

to a \(t\) distribution with \(n-2\) degrees of freedom. It should be noted, though, that the test statistic can be instead written as a function of the sample correlation coefficient:

\(R=\dfrac{\dfrac{1}{n-1} \sum\limits_{i=1}^n (X_i-\bar{X}) (Y_i-\bar{Y})}{\sqrt{\dfrac{1}{n-1} \sum\limits_{i=1}^n (X_i-\bar{X})^2} \sqrt{\dfrac{1}{n-1} \sum\limits_{i=1}^n (Y_i-\bar{Y})^2}}=\dfrac{S_{xy}}{S_x S_y}\)

That is, the test statistic can be alternatively written as:

\(t=\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\)

and because of its algebraic equivalence to the first test statistic, it too follows a \(t\) distribution with \(n-2\) degrees of freedom. Huh? How are the two test statistics algebraically equivalent? Well, if the following two statements are true:

  1. \(\hat{\beta}=\dfrac{\dfrac{1}{n-1} \sum\limits_{i=1}^n (X_i-\bar{X}) (Y_i-\bar{Y})}{\dfrac{1}{n-1} \sum\limits_{i=1}^n (X_i-\bar{X})^2}=\dfrac{S_{xy}}{S_x^2}=R\dfrac{S_y}{S_x}\)

  2. \(MSE=\dfrac{\sum\limits_{i=1}^n(Y_i-\hat{Y}_i)^2}{n-2}=\dfrac{\sum\limits_{i=1}^n\left[Y_i-\left(\bar{Y}+\dfrac{S_{xy}}{S_x^2} (X_i-\bar{X})\right) \right]^2}{n-2}=\dfrac{(n-1)S_Y^2 (1-R^2)}{n-2}\)

then simple algebra illustrates that the two test statistics are indeed algebraically equivalent:

\(\displaystyle{t=\frac{\hat{\beta}}{\sqrt{\frac{MSE}{\sum (x_i-\bar{x})^2}}} =\frac{r\left(\frac{S_y}{S_x}\right)}{\sqrt{\frac{(n-1)S^2_y(1-r^2)}{(n-2)(n-1)S^2_x}}}=\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}} \)

Now, for the veracity of those two statements? Well, they are indeed true. The first follows from just some simple algebra. The second requires somewhat trickier algebra that you'll soon be asked to work through for homework.
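If you'd like to convince yourself of the equivalence numerically in the meantime, here is a minimal sketch in Python. The simulated data and all variable names are our own invention, not part of the text; the point is only that the two forms of the statistic agree on any data set.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 25
x = rng.normal(loc=0.0, scale=2.0, size=n)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=n)

# Slope-based form: t = beta_hat / sqrt(MSE / sum((x_i - x_bar)^2))
s_xx = np.sum((x - x.mean()) ** 2)
s_xy = np.sum((x - x.mean()) * (y - y.mean()))
beta_hat = s_xy / s_xx
alpha_hat = y.mean() - beta_hat * x.mean()
mse = np.sum((y - (alpha_hat + beta_hat * x)) ** 2) / (n - 2)
t_slope = beta_hat / np.sqrt(mse / s_xx)

# Correlation-based form: t = r * sqrt(n - 2) / sqrt(1 - r^2)
r = np.corrcoef(x, y)[0, 1]
t_corr = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)

print(t_slope, t_corr)  # identical up to floating-point rounding
```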

An R-Test for Rho

It would be nice to use the sample correlation coefficient \(R\) as a test statistic to test more general hypotheses about the population correlation coefficient:

\(H_0:\rho=\rho_0\)

but the probability distribution of \(R\) is difficult to obtain. It turns out, though, that we can derive a hypothesis test using just \(R\), provided that we are interested in testing the more specific null hypothesis that \(X\) and \(Y\) are independent, that is, in testing \(H_0:\rho=0\).

Theorem

Provided that \(\rho=0\), the probability density function of the sample correlation coefficient \(R\) is:

\(g(r)=\dfrac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]}(1-r^2)^{(n-4)/2}\)

over the support \(-1<r<1\).

Proof

We'll use the distribution function technique, in which we first find the cumulative distribution function \(G(r)\), and then differentiate it to get the desired probability density function \(g(r)\). The cumulative distribution function is:

\(G(r)=P(R \leq r)=P \left(\dfrac{R\sqrt{n-2}}{\sqrt{1-R^2}}\leq \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)=P\left(T \leq \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)\)

The first equality is just the definition of the cumulative distribution function. The second holds because the transformation \(r \mapsto \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\) is strictly increasing, so the two events are equivalent, and the third comes from the definition of the \(T\) statistic as a function of the sample correlation coefficient \(R\). Now, using what we know of the p.d.f. \(h(t)\) of a \(T\) random variable with \(n-2\) degrees of freedom, we get:

\(G(r)=\int^{\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}}_{-\infty} h(t)dt=\int^{\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}}_{-\infty} \dfrac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]} \dfrac{1}{\sqrt{n-2}}\left(1+\dfrac{t^2}{n-2}\right)^{-\frac{(n-1)}{2}} dt\)

Now, it's just a matter of taking the derivative of the c.d.f. \(G(r)\) to get the p.d.f. \(g(r)\). Using the Fundamental Theorem of Calculus, in conjunction with the chain rule, we get:

\(g(r)=h\left(\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right) \dfrac{d}{dr}\left(\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)\)

Focusing first on the derivative part of that equation, using the quotient rule, we get:

\(\dfrac{d}{dr}\left[\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right]=\dfrac{(1-r^2)^{1/2} \cdot \sqrt{n-2}-r\sqrt{n-2}\cdot \frac{1}{2}(1-r^2)^{-1/2} \cdot (-2r) }{(\sqrt{1-r^2})^2}\)

Simplifying, we get:

\(\dfrac{d}{dr}\left[\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right]=\sqrt{n-2}\left[ \dfrac{(1-r^2)^{1/2}+r^2 (1-r^2)^{-1/2} }{1-r^2} \right]\)

Now, if we multiply by 1 in a special way, that is, this way:

\(\dfrac{d}{dr}\left[\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right]=\sqrt{n-2}\left[ \dfrac{(1-r^2)^{1/2}+r^2 (1-r^2)^{-1/2} }{1-r^2} \right]\left(\frac{(1-r^2)^{1/2}}{(1-r^2)^{1/2}}\right) \)

and then simplify, we get:

\(\dfrac{d}{dr}\left[\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right]=\sqrt{n-2}\left[ \dfrac{1-r^2+r^2 }{(1-r^2)^{3/2}} \right]=\sqrt{n-2}(1-r^2)^{-3/2}\)

Now, looking back at \(g(r)\), let's work on the \(h(\cdot)\) part. Substituting \(\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\) for \(t\) in the p.d.f. of a \(T\) random variable with \(n-2\) degrees of freedom, we get:

\( h\left(\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)= \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)}\left(\frac{1}{\sqrt{n-2}}\right)\left[1+\frac{\left(\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)^2}{n-2} \right]^{-\frac{n-1}{2}} \)

Canceling a few things out we get:

\(h\left(\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)=\dfrac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]}\cdot \dfrac{1}{\sqrt{n-2}}\left(1+\dfrac{r^2}{1-r^2}\right)^{-\frac{(n-1)}{2}}\)

Now, because:

\(\left(1+\dfrac{r^2}{1-r^2}\right)^{-\frac{(n-1)}{2}}=\left(\dfrac{1-r^2+r^2}{1-r^2}\right)^{-\frac{(n-1)}{2}}=\left(\dfrac{1}{1-r^2}\right)^{-\frac{(n-1)}{2}}=(1-r^2)^{\frac{(n-1)}{2}}\)

we finally get:

\(h\left(\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\right)=\dfrac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]}\cdot \dfrac{1}{\sqrt{n-2}}(1-r^2)^{\frac{(n-1)}{2}}\)

We're almost there! We just need to multiply the two parts together. Doing so, we get:

\(g(r)=\left[\frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{1}{2}\right)\Gamma\left(\frac{n-2}{2}\right)}\left(\frac{1}{\sqrt{n-2}}\right)(1-r^2)^{\frac{n-1}{2}}\right]\left[\sqrt{n-2}(1-r^2)^{-3/2}\right]\)

which simplifies to:

\(g(r)=\dfrac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]}(1-r^2)^{(n-4)/2}\)

over the support \(-1<r<1\), as was to be proved.
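As a quick sanity check on that algebra, the following sketch (ours, not the textbook's) numerically verifies that \(g(r)\) integrates to 1 over \(-1<r<1\) for a few sample sizes:

```python
from scipy.integrate import quad
from scipy.special import gamma

def g(r, n):
    # Density of the sample correlation R under rho = 0, as derived above
    const = gamma((n - 1) / 2) / (gamma(0.5) * gamma((n - 2) / 2))
    return const * (1 - r ** 2) ** ((n - 4) / 2)

for n in (5, 10, 30):
    total, _ = quad(g, -1, 1, args=(n,))
    print(n, total)  # each integral comes out to 1 (up to quadrature error)
```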

Now that we know the p.d.f. of \(R\), testing \(H_0:\rho=0\) against any of the possible alternative hypotheses just involves integrating \(g(r)\) to find the critical value(s) that ensure that \(\alpha\), the probability of a Type I error, is small. For example, to test \(H_0:\rho=0\) against the alternative \(H_A:\rho>0\), we find the value \(r_\alpha(n-2)\) such that:

\(P(R \geq r_\alpha(n-2))=\int_{r_\alpha(n-2)}^1 \dfrac{\Gamma[(n-1)/2]}{\Gamma(1/2)\Gamma[(n-2)/2]}(1-r^2)^{\frac{(n-4)}{2}}dr=\alpha\)

Yikes! Do you have any interest in integrating that function? Well, me neither! That's why we'll instead use an \(R\) Table, such as the one we have in Table IX at the back of our textbook.
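Happily, there's a software shortcut, too: because \(t=\frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\) is a strictly increasing function of \(r\), inverting it gives \(r=\frac{t}{\sqrt{t^2+n-2}}\), so a \(t\) critical value converts directly into an \(R\) critical value. Here is a small sketch (the function name is our own) that reproduces the tabled value used in the example below:

```python
import numpy as np
from scipy.stats import t as t_dist

def r_critical(alpha, n, two_sided=True):
    # Critical value of |R| for testing H0: rho = 0 at level alpha
    tail = alpha / 2 if two_sided else alpha
    t_crit = t_dist.ppf(1 - tail, df=n - 2)
    return t_crit / np.sqrt(t_crit ** 2 + n - 2)

print(r_critical(0.05, 10))  # ~0.6319, matching Table IX for 8 degrees of freedom
```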

An Approximate Z-Test for Rho

Okay, the derivation for this hypothesis test is going to be MUCH easier than the derivation for that last one. That's because we aren't going to derive it at all! We are going to simply state, without proof, the following theorem.

Theorem

The statistic:

\(W=\dfrac{1}{2}\ln\dfrac{1+R}{1-R}\)

follows an approximate normal distribution with mean \(E(W)=\dfrac{1}{2}\ln\dfrac{1+\rho}{1-\rho}\) and variance \(Var(W)=\dfrac{1}{n-3}\).

The theorem, therefore, allows us to test the general null hypothesis \(H_0:\rho=\rho_0\) against any of the possible alternative hypotheses by comparing the test statistic:

\(Z=\dfrac{\dfrac{1}{2}\ln\dfrac{1+R}{1-R}-\dfrac{1}{2}\ln\dfrac{1+\rho_0}{1-\rho_0}}{\sqrt{\dfrac{1}{n-3}}}\)

to a standard normal \(N(0,1)\) distribution.
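In software, the whole test fits in a small helper. The sketch below (names ours) returns the \(Z\) statistic and a two-sided p-value:

```python
import numpy as np
from scipy.stats import norm

def fisher_z_test(r, n, rho0=0.0):
    # W = (1/2) ln((1+R)/(1-R)) is approximately N(E(W), 1/(n-3))
    w = 0.5 * np.log((1 + r) / (1 - r))
    w0 = 0.5 * np.log((1 + rho0) / (1 - rho0))
    z = (w - w0) / np.sqrt(1 / (n - 3))
    p_two_sided = 2 * (1 - norm.cdf(abs(z)))
    return z, p_two_sided
```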

What? We've worked through no examples yet on this page? Let's take care of that by closing with an example that utilizes each of the three hypothesis tests we developed above.

Example 15-2


An admissions counselor at a large public university was interested in learning whether freshmen calculus grades are independent of high school math achievement test scores. The sample correlation coefficient between the mathematics achievement test scores and calculus grades for a random sample of \(n=10\) college freshmen was calculated to be 0.84.

Does this observed sample correlation coefficient suggest, at the \(\alpha=0.05\) level, that freshmen calculus grades are not independent of high school math achievement test scores?

Answer

The admissions counselor is interested in testing:

\(H_0:\rho=0\) against \(H_A:\rho \neq 0\)

Using the \(t\)-statistic we derived, we get:

\(t=\dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}=\dfrac{0.84\sqrt{8}}{\sqrt{1-0.84^2}}=4.38\)

We reject the null hypothesis if the test statistic is greater than 2.306 or less than −2.306.


Because \(t=4.38>2.306\), we reject the null hypothesis in favor of the alternative hypothesis. There is sufficient evidence at the 0.05 level to conclude that freshmen calculus grades are not independent of high school math achievement test scores.

Using the R-statistic, with 8 degrees of freedom, Table IX in the back of the book tells us to reject the null hypothesis if the absolute value of \(R\) is greater than 0.6319. Because our observed \(r=0.84>0.6319\), we again reject the null hypothesis in favor of the alternative hypothesis. There is sufficient evidence at the 0.05 level to conclude that freshmen calculus grades are not independent of high school math achievement test scores.

Using the approximate Z-statistic, we get:

\(z=\dfrac{\dfrac{1}{2}\ln\left(\dfrac{1+0.84}{1-0.84}\right)-\dfrac{1}{2}\ln\left(\dfrac{1+0}{1-0}\right)}{\sqrt{1/7}}=3.23\)

In this case, we reject the null hypothesis if the absolute value of \(Z\) is greater than 1.96. It clearly is, and so we again reject the null hypothesis in favor of the alternative hypothesis. There is sufficient evidence at the 0.05 level to conclude that freshmen calculus grades are not independent of high school math achievement test scores.
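For completeness, here is a single self-contained sketch (ours) that reproduces all three tests for this example:

```python
import numpy as np
from scipy.stats import t as t_dist, norm

r, n, alpha = 0.84, 10, 0.05

# T-test
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
t_crit = t_dist.ppf(1 - alpha / 2, df=n - 2)
print(f"t = {t_stat:.2f}, reject if |t| > {t_crit:.3f}")  # t = 4.38 > 2.306

# R-test (critical value via the inverse t transformation)
r_crit = t_crit / np.sqrt(t_crit ** 2 + n - 2)
print(f"r = {r:.2f}, reject if |r| > {r_crit:.4f}")  # 0.84 > 0.6319

# Approximate Z-test
z_stat = 0.5 * np.log((1 + r) / (1 - r)) / np.sqrt(1 / (n - 3))
z_crit = norm.ppf(1 - alpha / 2)
print(f"z = {z_stat:.2f}, reject if |z| > {z_crit:.2f}")  # 3.23 > 1.96
```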

