3.8 - Measures of Associations in \(I \times J\) tables

In the Coronary Heart Disease example, it is sensible to think of serum cholesterol as an explanatory variable and CHD as a response. It therefore makes sense to estimate the conditional probabilities of CHD within the four cholesterol groups. To do this, we simply divide each cell count \(n_{ij}\) by its column total \(n_{+j}\); the resulting proportion \(n_{ij}/n_{+j}\) is an estimate of \(P(Y = i | Z = j)\). To see this, note that

\(P(Y=i|Z=j)=\dfrac{P(Y=i,Z=j)}{P(Z=j)}=\dfrac{\pi_{ij}}{\pi_{+j}}\)

and is intuitively estimated by

\(\dfrac{n_{ij}/n}{n_{+j}/n}=\dfrac{n_{ij}}{n_{+j}}\)
These values correspond to "Col Pct" in the SAS output. In R, we need to calculate them based on the above formula, e.g., see HeartDisease.R. The result is shown below.

          0-199             200-219   220-259   260+
CHD       12/319 = .038     .031      .066      .143
no CHD    307/319 = .962    .969      .934      .857

The risk of CHD appears to be essentially constant for the groups with cholesterol levels between 0–199 and 200–219. Although the estimated probability drops from .038 to .031, this drop is not statistically significant. We can test this by doing a test for the difference in proportions or by doing a chi-square test of independence for the relevant \(2 \times 2\) sub-table:

          0-199             200-219
CHD       12/319 = .038     .031
no CHD    307/319 = .962    .969

The test yields \(X^2 = 0.157\) with df = 1 and a \(p\)-value of .69. For the other two groups, however, the estimated risk of CHD is substantially higher. We can do similar tests for other sets of cells. In fact, any two levels of cholesterol may be compared and tested for association between CHD and cholesterol level.

Describing associations in \(I \times J\) tables

In a \(2 \times 2\) table, the relationship between the two binary variables can be summarized by a single number (e.g., the odds ratio). For an \(I \times J\) table, the usual \(X^2\) or \(G^2\) test for independence has \((IJ - 1) - (I - 1) - (J - 1) = (I - 1)(J - 1)\) degrees of freedom. This means that, with \(I > 2\) or \(J > 2\), the data can depart from independence in more than one dimension. The direction and magnitude of the departure from the null hypothesis can no longer be summarized by a single number; they must be summarized by \((I - 1)(J - 1)\) numbers, which may be (i) differences in proportions, (ii) relative risks, and/or (iii) odds ratios.

In the Coronary Heart Disease study, for example, we could summarize the relationship between CHD and cholesterol level by a set of three relative risks:

  • 200–219 versus 0–199,
  • 220–259 versus 0–199, and
  • 260+ versus 0–199.

That is, we could estimate the risk of CHD at each cholesterol level relative to a common baseline. Or, we could use

  • 200–219 versus 0–199,
  • 220–259 versus 200–219, and
  • 260+ versus 220–259.

This estimates the risk of each category relative to the category immediately below. Other comparisons are also possible, but they may be less meaningful when interpreting the data.
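The two comparison schemes described above can be sketched in a few lines of Python. This is only an illustration, not the course's R or SAS code: it starts from the rounded column proportions reported earlier (.038, .031, .066, .143), so the relative risks are approximate, and the variable names are made up for this example.

```python
# Estimated P(CHD | cholesterol group), as rounded in the table of column proportions.
# Assumption: using rounded values, so the ratios below are approximate.
risk = {"0-199": 0.038, "200-219": 0.031, "220-259": 0.066, "260+": 0.143}
levels = list(risk)  # ordered from lowest to highest cholesterol group

# Scheme 1: each level relative to the common baseline 0-199.
rr_baseline = {lev: risk[lev] / risk[levels[0]] for lev in levels[1:]}

# Scheme 2: each level relative to the level immediately below it.
rr_adjacent = {
    f"{upper} vs {lower}": risk[upper] / risk[lower]
    for lower, upper in zip(levels, levels[1:])
}

for name, rr in rr_baseline.items():
    print(f"{name} vs {levels[0]}: RR = {rr:.2f}")
for name, rr in rr_adjacent.items():
    print(f"{name}: RR = {rr:.2f}")
```

Either scheme produces three numbers, matching the \((I - 1)(J - 1) = 3\) degrees of freedom of the \(2 \times 4\) table.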

Note! You can do this as an exercise by modifying the code in R or SAS.

Example: Smoking Behaviors

The table below classifies 5375 high school students according to the smoking behavior of the student \(Z\) and the smoking behavior of the student’s parents \(Y\). We are interested in whether the smoking behavior of the students is related to that of their parents.

                           Student smokes?
How many parents smoke?    Yes (Z = 1)    No (Z = 2)
Both (Y = 1)               400            1380
One (Y = 2)                416            1823
Neither (Y = 3)            188            1168

The test for independence yields \(X^2 = 37.6\) and \(G^2 = 38.4\) with df = 2 (\(p\)-values are essentially zero), so we conclude that \(Y\) and \(Z\) are related. It is natural to think of \(Z\) in this example as a response and \(Y\) as a predictor, so we will discuss the conditional distribution of \(Z\) given \(Y\). Let \(\pi_i = P(Z = 1|Y = i)\), for \(i=1,2,3\). The estimates of these probabilities are

\(\hat{\pi}_1=400/1780=0.225\)

\(\hat{\pi}_2=416/2239=0.186\)

\(\hat{\pi}_3=188/1356=0.139\)
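As a rough check on the test statistics quoted for the smoking table, the following Python sketch computes \(X^2\) and \(G^2\) directly from the observed counts. It is an illustration using only the standard library, not the course's smokeindep code; the closed-form \(p\)-value used here is valid only because df = 2.

```python
import math

# Observed counts: rows = parents smoking (both, one, neither),
# columns = student smokes (yes, no).
observed = [[400, 1380], [416, 1823], [188, 1168]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence: (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

pairs = [
    (o, e)
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
]
x2 = sum((o - e) ** 2 / e for o, e in pairs)         # Pearson statistic
g2 = 2 * sum(o * math.log(o / e) for o, e in pairs)  # likelihood-ratio statistic

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For df = 2 only, the chi-square upper tail has the closed form exp(-x / 2).
p_x2 = math.exp(-x2 / 2)
p_g2 = math.exp(-g2 / 2)

print(f"n = {n}, df = {df}")
print(f"X^2 = {x2:.1f} (p = {p_x2:.1e}), G^2 = {g2:.1f} (p = {p_g2:.1e})")
```

For tables with other degrees of freedom, a chi-square tail function from a statistics library would be needed in place of the `exp(-x / 2)` shortcut.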
We can then compare these estimated risks across the levels of \(Y\). The effect of \(Y\) on \(Z\) can be summarized with two differences. For example, we can calculate the increase in the probability of \(Z = 1\) as \(Y\) goes from 3 to 2, and as \(Y\) goes from 2 to 1:

\(\hat{d}_{23}=\hat{\pi}_2-\hat{\pi}_3=0.186-0.139=0.047\)

\(\hat{d}_{12}=\hat{\pi}_1-\hat{\pi}_2=0.225-0.186=0.039\)
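Alternatively, we may treat \(Y = 3\) as a baseline and calculate the increase in probability as we go from \(Y = 3\) to \(Y = 2\) and from \(Y = 3\) to \(Y = 1\):

\(\hat{d}_{23}=\hat{\pi}_2-\hat{\pi}_3=0.186-0.139=0.047\)

\(\hat{d}_{13}=\hat{\pi}_1-\hat{\pi}_3=0.225-0.139=0.086\)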
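The conditional proportions and the two sets of differences can be reproduced from the smoking table with a short Python sketch. The names `pi_hat`, `d_23`, `d_12`, and `d_13` are chosen here for illustration and simply mirror the notation \(\pi_i = P(Z = 1|Y = i)\).

```python
# Counts from the smoking table: (student smokes, student does not smoke).
counts = {1: (400, 1380), 2: (416, 1823), 3: (188, 1168)}  # keyed by Y = 1, 2, 3

# pi_hat[i] estimates P(Z = 1 | Y = i): smokers divided by the row total.
pi_hat = {y: yes / (yes + no) for y, (yes, no) in counts.items()}

# Adjacent-category differences: Y goes from 3 to 2, and from 2 to 1.
d_23 = pi_hat[2] - pi_hat[3]
d_12 = pi_hat[1] - pi_hat[2]

# Baseline difference: Y goes from 3 to 1 (d_23 serves both schemes).
d_13 = pi_hat[1] - pi_hat[3]

for y, p in pi_hat.items():
    print(f"pi_hat_{y} = {p:.3f}")
print(f"d_23 = {d_23:.3f}, d_12 = {d_12:.3f}, d_13 = {d_13:.3f}")
```

Note that `d_13` equals `d_12 + d_23`, so either pair of differences carries the same information about the three proportions.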
We may also express the effects as the sample odds ratios (e.g., look at any \(2\times 2\) table within this larger \(3 \times 2\) table):

\(\hat{\theta}_{23}=\dfrac{416\times 1168}{188\times 1823}=1.42\)

\(\hat{\theta}_{13}=\dfrac{400\times 1168}{188\times 1380}=1.80\)

The estimated value of 1.42 means that the odds of smoking for students with one smoking parent are estimated to be 42% higher than for students whose parents do not smoke (the last two rows of the table). The value of 1.80 means that the odds of smoking for students with two smoking parents are estimated to be 80% higher than for students whose parents do not smoke (the first and the last rows of the table).
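As an illustration, the sample odds ratios for the \(2 \times 2\) sub-tables of the smoking data can be computed as below. The helper `odds_ratio` is a hypothetical function written for this sketch, not a library call.

```python
# Counts from the smoking table: (student smokes, student does not smoke).
counts = {"both": (400, 1380), "one": (416, 1823), "neither": (188, 1168)}


def odds_ratio(row_a, row_b):
    """Sample odds ratio: odds of smoking in row_a divided by odds in row_b."""
    (a, b), (c, d) = row_a, row_b
    return (a * d) / (b * c)


theta_23 = odds_ratio(counts["one"], counts["neither"])   # one vs. neither
theta_13 = odds_ratio(counts["both"], counts["neither"])  # both vs. neither
theta_12 = odds_ratio(counts["both"], counts["one"])      # both vs. one

print(f"theta_23 = {theta_23:.2f}")
print(f"theta_13 = {theta_13:.2f}")
print(f"theta_12 = {theta_12:.2f} (equals theta_13 / theta_23)")
```

The third odds ratio is determined by the other two, which is one way to see that a \(3 \times 2\) table needs only two odds ratios to describe the association.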

In a \(3 \times 2\) table, the relationship between the two variables must be summarized with two differences in proportions, two relative risks, or two odds ratios. More generally, describing the relationship between the two variables in an \(I \times J\) table requires \((I - 1)(J - 1)\) numbers. Many different odds ratios can be specified, depending on the size of the table, but the minimum number needed to describe the data fully is \((I - 1)(J - 1)\), which equals the degrees of freedom for testing independence. Which odds ratios are most meaningful to the researcher depends on the research question at hand.

Besides the point estimates, we can also test hypotheses about the odds ratios or compute confidence intervals. You could do the same for the relative risks or difference in proportions as we discussed in previous sections. To do this computationally in SAS and/or R, we need to analyze each \(2\times2\) sub-table separately. Basically, we treat each \(2\times2\) table as a "new" data set.
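A minimal sketch of treating one \(2\times2\) sub-table as its own data set is shown below. It assumes the usual large-sample (Wald) interval on the log-odds-ratio scale, with standard error \(\sqrt{1/a + 1/b + 1/c + 1/d}\); the function name and the choice of a 95% level (\(z = 1.96\)) are assumptions made for this example.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) with a large-sample Wald interval built on the log scale."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Sub-table: one smoking parent (416, 1823) vs. neither parent smokes (188, 1168).
est, lower, upper = odds_ratio_wald_ci(416, 1823, 188, 1168)
print(f"odds ratio = {est:.2f}, approximate 95% CI = ({lower:.2f}, {upper:.2f})")

# An interval that excludes 1 is consistent with rejecting "no association"
# for this sub-table at roughly the 5% level.
print("interval excludes 1:", lower > 1 or upper < 1)
```

The same function can be applied to any other pair of rows, one sub-table at a time.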

In SAS, the option ALL should give all possible measures; see smokeindep.sas (output, smokeindep SAS output). Depending on which SAS version you are using, the options may differ (e.g., RELRISK, RRC1, RRC2), and some of them work only for \(2\times2\) tables. For the current list, see the current SAS Support Documentation.

In R, see smokeindep.R (output, smokeindep.out). The {vcd} package has a number of useful functions, e.g., oddsratio() and assocstats(); the latter gives \(X^2\), \(G^2\), and some other measures of association, such as Cramér's V.

Statistical versus Practical Significance

In proposing measures of effect size, we need to realize that there is a difference between saying that an effect is statistically significant and saying that it is large.

A test statistic or p-value is a measure of the evidence against a null hypothesis, and this evidence depends on the sample size. An effect size, however, should not change if n is arbitrarily increased.

In some situations, there may be an artificial dependency of statistical significance on sample size. If the sample size is small and a large-sample goodness-of-fit statistic is computed, the \(p\)-value may not be reliable because the large-sample theory will not hold. Conversely, if the sample size is very large, you may obtain statistically significant results even when the effect is too small to be of practical importance. Also, recall Type I and Type II errors of hypothesis testing.

The \(X^2\) and \(G^2\) test statistics are not appropriate measures of association between two variables. They are sufficient to test the null hypothesis, but not to describe the direction and magnitude of the association.

Here is a hypothetical example that will help to illustrate this point. First, consider the Vitamin C example again. The following table classifies a sample of French skiers by whether or not they got a cold and by whether they were given a placebo or Vitamin C (ascorbic acid).

                 Cold    No Cold    Totals
Placebo          31      109        140
Ascorbic Acid    17      122        139
Totals           48      231        279

The test for independence yields \(X^2 = 4.811\) and a \(p\)-value of 0.0283. The conclusion here would be that there is strong evidence against the null hypothesis of independence. Now, let's suppose that we artificially inflate the sample size by multiplying each cell count by ten.

                 Cold    No Cold    Totals
Placebo          310     1090       1400
Ascorbic Acid    170     1220       1390
Totals           480     2310       2790

The cell proportions for this table remain identical to those of the previous table; the relationship between the two binary variables appears to be exactly the same. Yet, now the \(X^2\) statistic is \(10(4.811) = 48.11\). The new \(p\)-value is close to 0, so the evidence against independence is now VERY strong, not because the relationship between the two variables is any stronger, but merely because the sample size has gone up.
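The scaling of \(X^2\) with sample size can be checked numerically. The sketch below uses the shortcut formula for the Pearson statistic of a \(2 \times 2\) table (without continuity correction) and, for df = 1, obtains the \(p\)-value from the standard normal tail via `math.erfc`; the function names are hypothetical and chosen for this example.

```python
import math


def pearson_x2_2x2(a, b, c, d):
    """Pearson X^2 for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(x2):
    """Upper-tail chi-square probability for df = 1: P(|Z| > sqrt(x2))."""
    return math.erfc(math.sqrt(x2 / 2))


original = (31, 109, 17, 122)                      # placebo row, ascorbic acid row
inflated = tuple(10 * count for count in original)

x2_orig = pearson_x2_2x2(*original)
x2_big = pearson_x2_2x2(*inflated)

print(f"original: X^2 = {x2_orig:.3f}, p = {p_value_df1(x2_orig):.4f}")
print(f"times 10: X^2 = {x2_big:.2f}, p = {p_value_df1(x2_big):.2e}")
print(f"ratio of statistics = {x2_big / x2_orig:.1f}")
```

Multiplying every cell by a constant multiplies \(X^2\) by the same constant, while the cell proportions are unchanged.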

Warning: A large \(p\)-value is NOT strong evidence in favor of \(H_0\). A large \(p\)-value can occur if (1) \(H_0\) is indeed true, or (2) \(H_0\) is false, but the test has low power.

Now, let's suppose that we artificially deflate the sample size by dividing each cell count by ten and rounding.

                 Cold    No Cold    Totals
Placebo          3       11         14
Ascorbic Acid    2       12         14
Totals           5       23         28

The cell proportions for this table remain nearly identical to those of the previous tables; the relationship between the two binary variables appears to be the same. Yet, now the \(X^2\) statistic is 0.2435. The new \(p\)-value is 0.6217, which gives little or no evidence against independence, but it does not tell us whether this lack of evidence is due to (a) a weak association between the two variables or (b) the sample size being too small. Moreover, \(X^2\) says nothing about the direction of the possible effect, e.g., whether vitamin C takers are more likely or less likely to get a cold than non-takers.
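To contrast the test statistic with measures of effect size, the following sketch tabulates \(X^2\), its \(p\)-value, the difference in proportions, and the odds ratio for the three versions of the Vitamin C table. It is an illustration only; the `summarize` helper is hypothetical, and the df = 1 \(p\)-value again comes from the normal tail.

```python
import math

# (placebo cold, placebo no cold, ascorbic acid cold, ascorbic acid no cold)
tables = {
    "original": (31, 109, 17, 122),
    "times 10": (310, 1090, 170, 1220),
    "divided by 10": (3, 11, 2, 12),
}


def summarize(a, b, c, d):
    """Return (X^2, p-value for df = 1, difference in proportions, odds ratio)."""
    n = a + b + c + d
    x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(x2 / 2))
    diff = a / (a + b) - c / (c + d)  # P(cold | placebo) - P(cold | ascorbic acid)
    odds_ratio = (a * d) / (b * c)
    return x2, p, diff, odds_ratio


results = {name: summarize(*cells) for name, cells in tables.items()}
for name, (x2, p, diff, odds_ratio) in results.items():
    print(f"{name:>14}: X^2 = {x2:8.4f}  p = {p:.4f}  "
          f"diff = {diff:.3f}  OR = {odds_ratio:.2f}")
```

The difference in proportions and the odds ratio keep the same sign and a similar size across the three tables, and they show the direction of the effect (more colds under placebo), whereas \(X^2\) and its \(p\)-value change dramatically with the sample size.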

SAS and R Notes

The above analysis was implemented in VitaminC.sas (output file VitaminC SAS Output). The corresponding R code file is VitaminC.R and is commented so that results similar to those in SAS can be obtained.