
Homogeneous association implies that the conditional relationship between any pair of variables given the third one is the same at each level of the third variable. This model is also known as a no three-factor interactions model or no second-order interactions model.

There is really no graphical representation for this model, but the log-linear notation is (AB, BC, AC), indicating that if we know all two-way tables, we have sufficient information to compute the expected counts under this model. In the log-linear notation the saturated model or the three-factor interaction model is (ABC). The homogeneous association model is “intermediate” in complexity, between the conditional independence model, (AB, AC) and the saturated model, (ABC).

If we take the conditional independence model (AB, AC)

plot

and add a direct link between B and C, the resulting graph is complete and therefore represents the saturated model, (ABC), not the homogeneous association model.

plot

  • The saturated model, (ABC), allows the BC odds ratios at each level of A = 1, . . . , I to be arbitrary.
  • The conditional independence model, (AB, AC) requires the BC odds ratios at each level of A = 1, . . . , I to be equal to 1.
  • The homogeneous association model, (AB, AC, BC), requires the BC odds ratios at each level of A to be identical, but not necessarily equal to 1.
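
The third bullet can be checked directly in log-linear terms. The sketch below assumes the standard \(\lambda\) parameterization of log-linear models (this notation is an assumption here, not something defined on this page) and, for simplicity, a 2 × 2 BC table at each level i of A, with \(\mu_{ijk}\) denoting the expected counts:

\(\log \mu_{ijk} = \lambda + \lambda_i^A + \lambda_j^B + \lambda_k^C + \lambda_{ij}^{AB} + \lambda_{ik}^{AC} + \lambda_{jk}^{BC}\)

\(\log \theta_{BC(i)} = \log \dfrac{\mu_{i11}\,\mu_{i22}}{\mu_{i12}\,\mu_{i21}} = \lambda_{11}^{BC} + \lambda_{22}^{BC} - \lambda_{12}^{BC} - \lambda_{21}^{BC}\)

Every term that involves i cancels, so the BC odds ratio is the same at each level of A. It equals 1 only when the BC terms vanish, which gives back the conditional independence model (AB, AC).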

Under the model of homogeneous association, there are no closed-form estimates for the expected cell probabilities like we have derived for the previous models. ML estimates must be computed by an iterative procedure. The most popular methods are

  • iterative proportional fitting (IPF), and
  • Newton-Raphson (NR).
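
To make the idea of IPF concrete, here is a minimal sketch in plain Python, not the routine used by SAS or R. It repeatedly rescales a table of fitted counts so that its AB, AC, and BC margins match the observed ones. The function name `ipf_homogeneous`, the tolerance, and the iteration cap are illustrative assumptions; the counts are those of the graduate admissions example later in this lesson, with A = department, B = sex, and C = admission status. The sketch also assumes that all observed two-way margins are positive.

```python
import math

def ipf_homogeneous(n, tol=1e-8, max_iter=500):
    """Fit the (AB, AC, BC) model to counts n[i][j][k] by IPF (illustrative)."""
    I, J, K = len(n), len(n[0]), len(n[0][0])
    m = [[[1.0] * K for _ in range(J)] for _ in range(I)]  # start from all ones
    for _ in range(max_iter):
        # Rescale so the fitted AB margin matches the observed AB margin.
        for i in range(I):
            for j in range(J):
                r = sum(n[i][j]) / sum(m[i][j])
                for k in range(K):
                    m[i][j][k] *= r
        # Rescale so the fitted AC margin matches the observed AC margin.
        for i in range(I):
            for k in range(K):
                r = (sum(n[i][j][k] for j in range(J))
                     / sum(m[i][j][k] for j in range(J)))
                for j in range(J):
                    m[i][j][k] *= r
        # Rescale so the fitted BC margin matches the observed BC margin.
        for j in range(J):
            for k in range(K):
                r = (sum(n[i][j][k] for i in range(I))
                     / sum(m[i][j][k] for i in range(I)))
                for i in range(I):
                    m[i][j][k] *= r
        # The BC step can disturb the AB and AC margins; stop once it no longer does.
        gap_ab = max(abs(sum(n[i][j]) - sum(m[i][j]))
                     for i in range(I) for j in range(J))
        gap_ac = max(abs(sum(n[i][j][k] - m[i][j][k] for j in range(J)))
                     for i in range(I) for k in range(K))
        if max(gap_ab, gap_ac) < tol:
            break
    return m

# counts[dept][sex][admission]: sex = (men, women), admission = (rejected, accepted)
counts = [
    [[313, 512], [19, 89]],    # A
    [[207, 353], [8, 17]],     # B
    [[205, 120], [391, 202]],  # C
    [[278, 139], [244, 131]],  # D
    [[138, 53], [299, 94]],    # E
    [[351, 22], [317, 24]],    # F
]

fitted = ipf_homogeneous(counts)

# Under (AB, AC, BC) the fitted sex-by-admission odds ratio is the same in every department.
odds_ratios = [f[0][0] * f[1][1] / (f[0][1] * f[1][0]) for f in fitted]

# Likelihood-ratio statistic comparing observed and fitted counts.
g2 = 2 * sum(counts[i][j][k] * math.log(counts[i][j][k] / fitted[i][j][k])
             for i in range(6) for j in range(2) for k in range(2))

print("fitted odds ratios:", [round(o, 4) for o in odds_ratios])
print("G2 =", round(g2, 2))
```

Each rescaling step keeps the fitted counts in the product form implied by the model, which is why the fitted odds ratios agree across departments even before the margins have fully converged.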

We will be able to fit this model later using software for logistic regression or log-linear models. For now, we will consider testing for the homogeneity of the odds-ratios via the Breslow-Day statistic.

Breslow-Day Statistic

To test the hypothesis that the odds ratio between A and B is the same at each level of C,

\(H_0 : \theta_{AB(1)} = \theta_{AB(2)} = \cdots = \theta_{AB(K)}\)

there is a non-model-based statistic, the Breslow-Day statistic (Agresti, Sec. 6.3.6), which has the same form as the Pearson \(X^2\) statistic:

\(X^2=\sum\limits_i\sum\limits_j\sum\limits_k \dfrac{(O_{ijk}-E_{ijk})^2}{E_{ijk}}\)

where the \(E_{ijk}\) are calculated assuming that the above \(H_0\) is true, that is, that there is a common odds ratio across the levels of the third variable.

The Breslow-Day statistic

  • has approximately a chi-squared distribution with df = K − 1 under H0, given a large sample size;
  • does not work well for small sample sizes, whereas the CMH test can still perform well;
  • has not been generalized to I × J × K tables, whereas the CMH test has.
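
The computation can be sketched in plain Python for K strata of 2 × 2 tables. This is a minimal illustration under stated assumptions, not the SAS or R implementation: the common odds ratio is estimated with the Mantel-Haenszel estimator, no Tarone correction is applied, and the helper names (`mh_odds_ratio`, `expected_a`, `breslow_day`, `chi2_sf`) are hypothetical. The counts are those of the graduate admissions example later in this lesson.

```python
import math

def mh_odds_ratio(strata):
    """Mantel-Haenszel estimate of the common odds ratio; each stratum is (a, b, c, d)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

def expected_a(a, b, c, d, psi):
    """Expected count in cell a, given the stratum margins and odds ratio psi."""
    n1, n2, m1 = a + b, c + d, a + c
    if abs(psi - 1.0) < 1e-12:
        return n1 * m1 / (n1 + n2)
    # Solve e (n2 - m1 + e) = psi (n1 - e)(m1 - e) for e.
    qa, qb, qc = 1 - psi, (n2 - m1) + psi * (n1 + m1), -psi * n1 * m1
    disc = math.sqrt(qb * qb - 4 * qa * qc)
    lo, hi = max(0, m1 - n2), min(n1, m1)
    roots = ((-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa))
    return next(r for r in roots if lo <= r <= hi)  # the root that keeps all cells >= 0

def breslow_day(strata):
    """Breslow-Day statistic (no Tarone correction) and its degrees of freedom."""
    psi = mh_odds_ratio(strata)
    stat = 0.0
    for a, b, c, d in strata:
        n1, n2, m1 = a + b, c + d, a + c
        e = expected_a(a, b, c, d, psi)
        # Summing (O - E)^2 / E over the four cells equals (a - e)^2 / var.
        var = 1 / (1 / e + 1 / (n1 - e) + 1 / (m1 - e) + 1 / (n2 - m1 + e))
        stat += (a - e) ** 2 / var
    return stat, len(strata) - 1

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    y = x / 2
    if df % 2:   # odd df: start from the df = 1 tail
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:        # even df: start from the df = 2 tail
        a, q = 1.0, math.exp(-y)
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

# (men rejected, men accepted, women rejected, women accepted) for departments A to F
berkeley = [(313, 512, 19, 89), (207, 353, 8, 17), (205, 120, 391, 202),
            (278, 139, 244, 131), (138, 53, 299, 94), (351, 22, 317, 24)]

stat, df = breslow_day(berkeley)
p_value = chi2_sf(stat, df)
print(f"Breslow-Day = {stat:.2f}, df = {df}, p-value = {p_value:.4f}")
```

The result can be compared with the value that SAS and R report for these data in the graduate admissions example; small differences are possible if the software applies a correction that this sketch omits.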

If we reject the conditional independence with the CMH test, we should still test for homogeneous associations. 

Example - Boy Scouts and Juvenile Delinquency

We should expect the homogeneous association model to fit well for the boy scout example, since we already concluded that the conditional independence model fits well and the conditional independence model is a special case of the homogeneous association model. Let's see how we compute the Breslow-Day statistic in SAS or R.

In SAS, the cmh option will produce the Breslow-Day statistic; see boys.sas.

For the boy scout example, the Breslow-Day statistic is 0.15 with df = 2, p-value = 0.93. We do NOT have sufficient evidence to reject the model of homogeneous associations; the data are consistent with the associations being very similar across the levels of socioeconomic status.

SAS output

The estimated odds ratios for the partial tables are: \(\hat{\theta}_{BD(high)}=1.20\approx \hat{\theta}_{BD(mild)}=0.89\approx \hat{\theta}_{BD(low)}=1.02\)

In this case, the common odds ratio estimate from the CMH test is a good summary of the above values, i.e., common OR = 0.978 with 95% confidence interval (0.597, 1.601).

Of course, this was to be expected for this example, since we already concluded that the conditional independence model fits well, and the conditional independence model is a special case of the homogeneous association model.

__________________________

There is no single built-in function in R that computes the Breslow-Day statistic.

You need to either write your own function, fit the homogeneous association model with log-linear model functions such as loglin() or glm() to test the above hypothesis, or use the function breslowday.test() provided in the R code file breslowday.test_.R. This function is called in the R code file boys.R below.

breslow day R code

Here is the resulting output:

breslow day R output

We will see more about these models and functions in both R and SAS in the upcoming lessons.

Example - Graduate Admissions

The 6 × 2 × 2 table below lists graduate admissions information for the six largest departments at U.C. Berkeley in the fall of 1973. For additional marginal tables see the Introduction section.

Dept.   Men rejected   Men accepted   Women rejected   Women accepted
A       313            512            19               89
B       207            353            8                17
C       205            120            391              202
D       278            139            244              131
E       138            53             299              94
F       351            22             317              24

Let D = department, S = sex, and A = admission status (rejected or accepted). We could try to assess whether there is a gender bias in admissions or whether admissions differ across departments.

In this case we need to test two hypotheses: first, whether gender is marginally independent of admission, and second, whether their relationship can be explained by department. That is, test if:

\(\theta_{SA(a)} = \theta_{SA(b)} = \theta_{SA(c)} = \theta_{SA(d)} = \theta_{SA(e)} = \theta_{SA(f)} = 1\)

See berkeley.sas, which performs this test, and the output in berkeley.lst.

See berkeley.R, which performs this test using a dataset already in R, or berkeley1.R, which reads the data file berkeley.txt. The outputs for these R programs can be found in berkeley.out and berkeley1.out, respectively.

For the test of marginal independence of gender and admission (see the marginal table):

\(G^2\) = 93.92, df = 1, p-value < 0.0001.

All the expected values are greater than five, so we can rely on the large-sample approximation and conclude that admission and gender are dependent. The estimated odds ratio, 0.5423, with 95% confidence interval (0.4785, 0.6147), indicates that the odds of acceptance for females are about 0.54 times those for males; equivalently, the odds of acceptance for males are about 1.84 times (almost twice) those for females.
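
The marginal analysis can be reproduced from the table above with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the confidence interval is assumed to be the usual Wald interval on the log odds ratio scale, which may not be exactly what the SAS and R programs use.

```python
import math

# (men rejected, men accepted, women rejected, women accepted) for departments A to F
berkeley = [(313, 512, 19, 89), (207, 353, 8, 17), (205, 120, 391, 202),
            (278, 139, 244, 131), (138, 53, 299, 94), (351, 22, 317, 24)]

# Collapse over department to get the marginal sex-by-admission table.
men_rej, men_acc, women_rej, women_acc = (sum(col) for col in zip(*berkeley))

# Odds of acceptance for women relative to men.
odds_ratio = (women_acc * men_rej) / (women_rej * men_acc)

# Wald 95% confidence interval on the log scale (an assumed, standard choice).
se = math.sqrt(1 / men_rej + 1 / men_acc + 1 / women_rej + 1 / women_acc)
ci = (math.exp(math.log(odds_ratio) - 1.96 * se),
      math.exp(math.log(odds_ratio) + 1.96 * se))

# Likelihood-ratio statistic for independence in the 2 x 2 marginal table.
observed = [[men_rej, men_acc], [women_rej, women_acc]]
total = men_rej + men_acc + women_rej + women_acc
row = [sum(r) for r in observed]
col = [observed[0][k] + observed[1][k] for k in range(2)]
g2 = 2 * sum(observed[j][k] * math.log(observed[j][k] * total / (row[j] * col[k]))
             for j in range(2) for k in range(2))

print(f"OR = {odds_ratio:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f}), G2 = {g2:.2f}")
```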

But this relationship can (apparently) be explained by the influence of the department variable. CMH = 1.43, df = 1, with p-value = 0.23, suggests that gender and admission are conditionally independent given department. The Mantel-Haenszel estimate of the common odds ratio is 1.102 with 95% CI (0.94, 1.29).
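
As a rough check on these numbers, the CMH statistic and the Mantel-Haenszel common odds ratio can be sketched in Python as follows. The function names `cmh_test` and `mh_odds_ratio` are illustrative, and the statistic is computed without a continuity correction, which is an assumption about how the reported value was obtained.

```python
import math

def cmh_test(strata):
    """CMH statistic (df = 1, no continuity correction) for K strata of 2 x 2 tables."""
    diff = var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        n1, n2, m1, m2 = a + b, c + d, a + c, b + d
        diff += a - n1 * m1 / n                           # observed minus expected for cell a
        var += n1 * n2 * m1 * m2 / (n * n * (n - 1))      # hypergeometric variance
    stat = diff ** 2 / var
    p_value = math.erfc(math.sqrt(stat / 2))              # chi-squared tail with df = 1
    return stat, p_value

def mh_odds_ratio(strata):
    """Mantel-Haenszel estimate of the common odds ratio."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# (men rejected, men accepted, women rejected, women accepted) for departments A to F
berkeley = [(313, 512, 19, 89), (207, 353, 8, 17), (205, 120, 391, 202),
            (278, 139, 244, 131), (138, 53, 299, 94), (351, 22, 317, 24)]

cmh, p = cmh_test(berkeley)
common_or = mh_odds_ratio(berkeley)
print(f"CMH = {cmh:.2f}, p-value = {p:.2f}, common OR = {common_or:.3f}")
```

The confidence interval for the common odds ratio needs a separate variance estimate and is left out of this sketch.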

However, the Breslow-Day statistic testing for the homogeneity of the odds ratios is 18.83, df = 5, p-value = 0.002!

What is going on? Recall from the previous page that the CMH test is not appropriate when the conditional odds ratios are not all in the same direction. So when you use the CMH test, it is important to check that this assumption holds. What happens if you drop one of the departments? We will see these data again in the homework.
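
A quick way to look at the direction of the association in each partial table is to compute the sample odds ratio department by department. The sketch below does this in Python from the table above; the orientation (odds of acceptance for women relative to men) is chosen to match the marginal and common odds ratios quoted earlier, and the variable names are illustrative.

```python
# (men rejected, men accepted, women rejected, women accepted) by department
berkeley = {
    "A": (313, 512, 19, 89),
    "B": (207, 353, 8, 17),
    "C": (205, 120, 391, 202),
    "D": (278, 139, 244, 131),
    "E": (138, 53, 299, 94),
    "F": (351, 22, 317, 24),
}

# Odds of acceptance for women relative to men within each department.
dept_or = {dept: (w_acc * m_rej) / (w_rej * m_acc)
           for dept, (m_rej, m_acc, w_rej, w_acc) in berkeley.items()}

for dept, value in dept_or.items():
    direction = "above 1" if value > 1 else "below 1"
    print(f"Department {dept}: OR = {value:.2f} ({direction})")
```

Listing which departments fall on each side of 1, and by how much, is a useful first step before deciding whether a single common odds ratio is a sensible summary.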

Another interesting aspect is a visualization of these odds-ratios in partial tables. You can further explore this on your own by checking out the Fourfold plots in R and SAS, e.g., fourfold() function in R's {vcd} package, and fourfold.sas from Michael Friendly's webpage. Exploring table visualization could be an interesting class project!