
How can we carry out the test of independence computationally? Let us illustrate using the Vitamin C example. We are interested in whether treatment type and contracting a cold are associated.

Here is an example in SAS using the program VitaminC.sas; for the R code, see below.

[Image: SAS program, VitaminC.sas]

Let's look at different parts of the output. The SAS output for this program is in VitaminC.lst.

Heading I: The "Table of treatment by response" section produces the table with observed counts, expected counts, sample proportions, and conditional probabilities.

[Image: SAS output, Table of treatment by response]

Heading II: The "Statistics for Table of treatment by response" section produces various test statistics, such as \(X^2\) and \(G^2\).

[Image: SAS output, Statistics for Table of treatment by response]

\(X^2\) = 4.8114 and \(G^2\) = 4.8717, each with df = 1, indicate strong evidence for rejecting the independence model. The Continuity Adj. Chi-Square = 4.1407, with p-value = 0.0419, is Pearson's \(X^2\) adjusted with the so-called Yates' continuity correction, which gives a better approximation to exact inference and is useful when some cell counts are small. The correction subtracts 0.5 from the absolute difference between the observed count \(O_{ij}\) and the expected count \(E_{ij}\) in each term of the statistic; that is, \(|O_{ij} - E_{ij}|\) is replaced by \(|O_{ij} - E_{ij}| - 0.5\) before squaring. The corrected statistic gives conservative inference: it yields a larger p-value than the usual Pearson \(X^2\) statistic without the correction. Since SAS can produce exact tests, we won't need to consider this statistic.
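To make the correction concrete, the three statistics can be written side by side. This is a sketch in notation assumed here rather than copied from the SAS output: \(O_{ij}\) is the observed count in cell \((i,j)\), and \(E_{ij}\) is the expected count under independence, that is, the row total times the column total divided by the sample size.

\[
X^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
X^2_{\text{Yates}} = \sum_{i,j} \frac{\left(|O_{ij} - E_{ij}| - 0.5\right)^2}{E_{ij}}, \qquad
G^2 = 2 \sum_{i,j} O_{ij} \log \frac{O_{ij}}{E_{ij}}
\]

In a 2 × 2 table, \(|O_{ij} - E_{ij}|\) is the same in all four cells. Assuming the Vitamin C counts are 31 and 109 (placebo) and 17 and 122 (ascorbic acid), that common value is about 6.914 and \(\sum_{i,j} 1/E_{ij} \approx 0.10065\), so

\[
X^2 \approx 6.914^2 \times 0.10065 \approx 4.81, \qquad
X^2_{\text{Yates}} \approx 6.414^2 \times 0.10065 \approx 4.14,
\]

which agrees with the values reported above and shows why the corrected statistic is always the smaller of the two.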

The remaining parts of the output will be discussed later in this lesson. You can view them by opening the VitaminC.lst file or by running the SAS program and generating the results on your own.

Here is the test of independence for the Vitamin C example in R. The code, VitaminC.R, and its output, VitaminC.out, are also found in the section with R files.

[Image: R code, VitaminC.R]
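Since the program listing is not reproduced in the text here, the following is a minimal sketch of what such a test could look like. It is not the course's VitaminC.R file: the object name `ski` and the cell counts (31 colds among 140 on placebo, 17 among 139 on ascorbic acid) are assumptions, taken to be the counts of the Vitamin C example. They do reproduce the statistics quoted in this section.

```r
# Counts ASSUMED from the Vitamin C example
# (rows: treatment; columns: cold / no cold).
ski <- matrix(c(31, 109,
                17, 122),
              nrow = 2, byrow = TRUE,
              dimnames = list(Treatment = c("Placebo", "VitaminC"),
                              Cold = c("Cold", "NoCold")))

# Pearson chi-squared test without the continuity correction
pearson <- chisq.test(ski, correct = FALSE)

# Default call: for a 2 x 2 table, R applies Yates' continuity correction
yates <- chisq.test(ski)

pearson$statistic  # X-squared, about 4.8114
pearson$parameter  # degrees of freedom, 1
pearson$p.value    # p-value for the uncorrected test
pearson$expected   # expected counts under independence
yates$statistic    # corrected X-squared, about 4.1407
yates$p.value      # about 0.0419
```

Storing the result of `chisq.test()` in an object, rather than only printing it, makes the expected counts and the p-value available for later use.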

Notice that, by default, the chisq.test() function in R gives the \(X^2\) statistic with Yates' continuity correction for a 2 × 2 table. This correction subtracts 0.5 from the absolute difference between the observed count \(O_{ij}\) and the expected count \(E_{ij}\) in each term of the statistic; that is, \(|O_{ij} - E_{ij}|\) is replaced by \(|O_{ij} - E_{ij}| - 0.5\) before squaring. It is used when there are cells with small expected counts (e.g., less than 5) in order to better approximate exact inference. The corrected statistic gives conservative inference: it yields a larger p-value than the usual Pearson \(X^2\) statistic without the correction. To get the usual \(X^2\), set the option correct = FALSE; see the code.
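The correction described above can also be checked by hand, and the likelihood-ratio statistic \(G^2\) can be computed the same way. The sketch below is not the class function LRstats.R; the variable names are hypothetical, and the counts are again assumed to be those of the Vitamin C example.

```r
# Counts ASSUMED from the Vitamin C example
# (rows: placebo, ascorbic acid; columns: cold, no cold).
obs <- matrix(c(31, 109,
                17, 122), nrow = 2, byrow = TRUE)

# Expected counts under independence: row total * column total / n
expctd <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Usual Pearson statistic
X2 <- sum((obs - expctd)^2 / expctd)

# Yates-corrected statistic: shrink each |O - E| by 0.5 before squaring
X2_yates <- sum((abs(obs - expctd) - 0.5)^2 / expctd)

# Likelihood-ratio (deviance) statistic
G2 <- 2 * sum(obs * log(obs / expctd))

# Degrees of freedom for an I x J table: (I - 1)(J - 1)
df <- (nrow(obs) - 1) * (ncol(obs) - 1)

# Upper-tail chi-squared p-values for all three statistics
stats <- c(X2 = X2, X2_yates = X2_yates, G2 = G2)
p_values <- pchisq(stats, df = df, lower.tail = FALSE)

round(stats, 4)
round(p_values, 4)
```

One detail worth knowing: chisq.test() subtracts min(0.5, |O - E|) rather than a flat 0.5, so that the correction never overshoots zero. With these counts every |O - E| is well above 0.5, so the hand computation and the default chisq.test() result coincide.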

\(X^2\) = 4.8114 and \(G^2\) = 4.8717, each with df = 1, indicate strong evidence for rejecting the independence model. The corrected chi-square value of 4.1407, with p-value = 0.0419, is Pearson's \(X^2\) adjusted with Yates' continuity correction for a better approximation to exact inference, which is useful when we have small cell counts. To compute the deviance statistic for two-way tables, we can write our own code, use the function LRstats.R created for this class, or use one of the R packages, such as vcd; examples of all three are provided.

Here are two brief videos explaining parts of the R code and the output: