
We are interested in answering the following questions for the Coronary Heart Disease example from the introduction: Is having coronary heart disease independent of the cholesterol level in one's body? Is there any evidence of a relationship or association between cholesterol and heart disease?

Test of Independence in SAS - Coronary Heart Disease Example

Let's see the same calculation using the SAS code below:

SAS program

The SAS output can be found in HeartDisease.lst

Here is a portion of the output from SAS with the Pearson chi-square statistic and Deviance (likelihood-ratio chi-square) statistic:

SAS output

Test of Independence in R - Coronary Heart Disease Example

Two different computations are done in the HeartDisease.R file using the function chisq.test(). Here is the first:

heart disease R code

You will notice in the file that, unlike for 2 × 2 tables, where we had to worry about R applying Yates' continuity correction, there is no such correction for I × J tables. In this example it doesn't matter whether you call chisq.test(heart, correct=TRUE) or chisq.test(heart, correct=FALSE), because you get the same result. There are other ways to code this in R; you can refer to other examples on the R programs page.
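To make the point about the `correct` argument concrete, here is a minimal sketch. It is not the code from HeartDisease.R. The cell counts are an assumption: they are taken to be those of the CHD table from the introduction, which this section does not reproduce, so check them against the course file before relying on them. The object name `heart` follows the calls quoted above; the row and column labels are illustrative.

```r
# Counts ASSUMED to match the introduction's CHD table (4 cholesterol levels x 2 outcomes).
# Verify against HeartDisease.R; the labels are illustrative only.
heart <- matrix(c(12, 8, 31, 41,        # column 1: CHD
                  307, 246, 439, 245),  # column 2: no CHD
                nrow = 4,
                dimnames = list(cholesterol = c("0-199", "200-219", "220-259", "260+"),
                                chd = c("CHD", "no CHD")))

with_correction    <- chisq.test(heart, correct = TRUE)
without_correction <- chisq.test(heart, correct = FALSE)

# chisq.test() applies the continuity correction only to 2 x 2 tables,
# so for this 4 x 2 table the two calls should give the same statistic.
same_statistic <- isTRUE(all.equal(unname(with_correction$statistic),
                                   unname(without_correction$statistic)))
print(same_statistic)
print(without_correction)
```

If the assumed counts are right, the printed statistic should agree with the Pearson value reported from HeartDisease.out.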

From the output file HeartDisease.out we can see that Pearson's chi-squared statistic is

X² = 35.0285, df = 3, p-value = 1.202e-07

and the likelihood-ratio test statistic is

G² = 31.9212, df = 3, p-value = 5.43736e-07

Notice that the chisq.test() function does not compute G², so we wrote extra code to do that. For the complete output, please run the HeartDisease.R file. There is also a more general function, LRstats.R, for computing G² for two-way tables. The results are discussed further below.
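One way the likelihood-ratio statistic could be computed by hand is sketched below. It uses the expected counts that chisq.test() returns. This is not the code from HeartDisease.R or LRstats.R, only an illustration of the formula 2 Σ O log(O/E). The cell counts are again an assumption (the introduction's CHD table, not reproduced here), and the sketch assumes no observed cell is zero.

```r
# Counts ASSUMED to match the introduction's CHD table; verify against HeartDisease.R.
heart <- matrix(c(12, 8, 31, 41,
                  307, 246, 439, 245),
                nrow = 4,
                dimnames = list(cholesterol = c("0-199", "200-219", "220-259", "260+"),
                                chd = c("CHD", "no CHD")))

fit      <- chisq.test(heart)
observed <- fit$observed
expected <- fit$expected   # row total * column total / grand total

# Likelihood-ratio (deviance) statistic; assumes every observed count is positive,
# since a zero cell would produce 0 * log(0) = NaN in this simple form.
G2   <- 2 * sum(observed * log(observed / expected))
df   <- unname(fit$parameter)            # (I - 1)(J - 1)
p_G2 <- pchisq(G2, df = df, lower.tail = FALSE)

cat("G2 =", G2, " df =", df, " p-value =", p_G2, "\n")
```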

Conclusion: We reject the null hypothesis of independence because both chi-square statistics are very large relative to their reference distribution. The degrees of freedom equal (4-1)(2-1) = 3, and for a chi-square distribution with 3 degrees of freedom, values this large correspond to very small p-values. The X² test for independence therefore provides strong evidence of a relationship between cholesterol and CHD.
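The degrees of freedom and p-values quoted above can be checked directly from the chi-square distribution. The short sketch below uses only the statistics reported in this section; the 0.05 significance level used for the critical value is an assumption for illustration, not something the text specifies.

```r
# Statistics as reported in the output above.
X2 <- 35.0285
G2 <- 31.9212

# Degrees of freedom for a 4 x 2 table: (I - 1)(J - 1).
df <- (4 - 1) * (2 - 1)

# Upper-tail probabilities under the chi-square distribution with df degrees of freedom.
p_X2 <- pchisq(X2, df = df, lower.tail = FALSE)
p_G2 <- pchisq(G2, df = df, lower.tail = FALSE)

# Critical value at an ASSUMED 0.05 significance level, for comparison.
critical_value <- qchisq(0.95, df = df)

cat("df =", df, " critical value =", critical_value, "\n")
cat("p-value for X2 =", p_X2, "\n")
cat("p-value for G2 =", p_G2, "\n")
```

Both statistics lie far above the critical value, which is what the rejection of independence rests on.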

A good statistical analysis, however, should not end with the rejection of a null hypothesis. Once we have demonstrated that a relationship exists, we should describe that relationship. To do this, we consider (i) residuals, (ii) measures of association, and (iii) partitioning of the chi-square statistic.
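As a first look at item (i), the object returned by chisq.test() already carries Pearson and standardized residuals. The sketch below only shows where to find them; it is not the course's own code, and the cell counts are once more an assumption (the introduction's CHD table), so verify them against HeartDisease.R.

```r
# Counts ASSUMED to match the introduction's CHD table; verify against HeartDisease.R.
heart <- matrix(c(12, 8, 31, 41,
                  307, 246, 439, 245),
                nrow = 4,
                dimnames = list(cholesterol = c("0-199", "200-219", "220-259", "260+"),
                                chd = c("CHD", "no CHD")))

fit <- chisq.test(heart)

# Pearson residuals: (observed - expected) / sqrt(expected), one per cell.
pearson_resid <- fit$residuals

# Standardized residuals, which also adjust for the row and column margins.
std_resid <- fit$stdres

print(round(pearson_resid, 2))
print(round(std_resid, 2))

# The squared Pearson residuals add up to the Pearson chi-square statistic.
print(sum(pearson_resid^2))
```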