# 3.2.1 - Implementing the Analysis in R and SAS

We are interested in answering the following questions for the Coronary Heart Disease example from the introduction: Is having coronary heart disease independent of the cholesterol level in one's body? Is there any evidence of a relationship or association between cholesterol and heart disease?

#### Test of Independence in SAS - Coronary Heart Disease Example

The calculation is carried out in SAS by the program HeartDisease.sas.

The SAS output can be found in HeartDisease.lst.

The relevant portion of that output reports the Pearson chi-square statistic and the deviance (likelihood-ratio chi-square) statistic.

#### Test of Independence in R - Coronary Heart Disease Example

Two different computations are done in the file HeartDisease.R using the function **chisq.test()**.

You will notice in the file that, unlike for 2 × 2 tables, where we had to worry about R applying Yates' continuity correction, no such correction is applied to *I* × *J* tables. In this example it does not matter whether you call chisq.test(heart, correct=TRUE) or chisq.test(heart, correct=FALSE): you get the same result. There are other ways to code this in R; you can refer to the other examples on the R programs page.
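The behaviour of the `correct` argument can be checked with a minimal sketch like the one below. It is not the contents of HeartDisease.R. The cell counts and category labels are assumed to be those of the CHD table from the introduction (they are not listed in this section), and the object name `heart` simply follows the calls quoted above.

```r
# ASSUMPTION: counts taken to be the introduction's CHD table
# (rows = serum cholesterol level, columns = CHD status).
heart <- matrix(c(12, 307,
                   8, 246,
                  31, 439,
                  41, 245),
                nrow = 4, byrow = TRUE,
                dimnames = list(cholesterol = c("0-199", "200-219", "220-259", "260+"),
                                chd = c("CHD", "no CHD")))

# The continuity correction is only applied to 2 x 2 tables,
# so for this 4 x 2 table both calls give the same answer.
with_correction    <- chisq.test(heart, correct = TRUE)
without_correction <- chisq.test(heart, correct = FALSE)

with_correction$statistic   # Pearson X^2
with_correction$parameter   # degrees of freedom
with_correction$p.value     # p-value

# TRUE when the two statistics agree
all.equal(with_correction$statistic, without_correction$statistic)
```

If the assumed counts are right, the statistic and degrees of freedom should agree with the values reported in HeartDisease.out.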

From the output file HeartDisease.out we can see that Pearson's chi-squared statistic is

*X*^{2} = 35.0285, *df* = 3, *p*-value = 1.202e-07

and the likelihood-ratio test statistic is

*G*^{2} = 31.9212, *df* = 3, *p*-value = 5.43736e-07

Notice that the chisq.test() function does not compute *G*^{2}; we wrote extra code to do that. For the complete output, please run the HeartDisease.R file. A more general function for computing *G*^{2} for two-way tables is also available in LRstats.R. The results are discussed further below.

**Conclusion:** We reject the null hypothesis of independence because the values of the chi-square statistics are large relative to a chi-square distribution with 3 = (4-1)(2-1) degrees of freedom, which makes the *p*-values very small. Therefore, through the *X*^{2} test for independence, we have demonstrated beyond a reasonable doubt that a relationship exists between cholesterol and CHD.
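Because chisq.test() does not report the likelihood-ratio statistic, it may help to see how it, the degrees of freedom, and the *p*-value can be computed by hand. The sketch below is an illustration only: `lr_stat` is a hypothetical helper, not the function in LRstats.R, and the counts are again assumed to be those of the CHD table from the introduction. It computes twice the sum over cells of observed × log(observed / expected), with expected counts taken under independence.

```r
# ASSUMPTION: counts taken to be the introduction's CHD table
# (rows = cholesterol level, columns = CHD / no CHD).
heart <- matrix(c(12, 307,
                   8, 246,
                  31, 439,
                  41, 245),
                nrow = 4, byrow = TRUE)

# Hypothetical helper (not LRstats.R): likelihood-ratio test of
# independence for a two-way table of counts.
lr_stat <- function(tab) {
  # Expected counts under independence: row total * column total / n
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  # Empty cells contribute 0 to G^2, so skip them to avoid log(0)
  pos <- tab > 0
  g2 <- 2 * sum(tab[pos] * log(tab[pos] / expected[pos]))
  # (I - 1)(J - 1) degrees of freedom
  df <- (nrow(tab) - 1) * (ncol(tab) - 1)
  c(G2 = g2, df = df, p.value = pchisq(g2, df, lower.tail = FALSE))
}

res <- lr_stat(heart)
res
```

With the assumed counts, `res` should agree with the *G*^{2}, *df*, and *p*-value reported above, up to rounding.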

A good statistical analysis, however, should not end with the rejection of a null hypothesis. Once we have demonstrated that a relationship exists, we should describe that relationship. To do this we consider (i) residuals, (ii) measures of association, and (iii) partitioning of chi-square.