10.2.5 - Homogeneous Association

The homogeneous association model is also known as the model of no three-way interaction. Denoted by (AB, AC, BC), it imposes only one restriction beyond the saturated model: each pairwise conditional association does not depend on the level at which the third variable is fixed. For example, the conditional odds ratio between \(A\) and \(B\), given that \(C\) is fixed at its first level, must be the same as the conditional odds ratio between \(A\) and \(B\), given that \(C\) is fixed at its second level, and so on.

Main assumptions

  • The N = IJK counts in the cells are assumed to be independent observations of a Poisson random variable, and
  • there is no three-way interaction among the variables: \(\lambda_{ijk}^{ABC}=0\) for all \(i, j, k\).

Model Structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C+\lambda_{ij}^{AB}+\lambda_{jk}^{BC}+\lambda_{ik}^{AC}\)
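To see why omitting the three-way term forces a common conditional odds ratio, consider a 2x2 slice of the table at level \(k\) of \(C\), formed by rows \(i, i'\) of \(A\) and columns \(j, j'\) of \(B\) (the primed indices and the symbol \(\theta_{AB(k)}\) are labels introduced here only for illustration). Substituting the model into the log of the conditional odds ratio, every term that involves \(k\) cancels:

\[
\log\theta_{AB(k)}=\log\frac{\mu_{ijk}\,\mu_{i'j'k}}{\mu_{ij'k}\,\mu_{i'jk}}=\lambda_{ij}^{AB}+\lambda_{i'j'}^{AB}-\lambda_{ij'}^{AB}-\lambda_{i'j}^{AB}
\]

The right-hand side does not involve \(k\), so the \(AB\) odds ratio is the same at every level of \(C\). The same argument applies to the \(AC\) and \(BC\) associations.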

In terms of the Berkeley example, this model implies that the conditional association between department and sex does not depend on the fixed value of admission status, that the conditional association between sex and admission status does not depend on the fixed value of department, and the conditional association between department and admission status does not depend on the fixed value of sex. 

Does this model fit? Even this model doesn't fit well, although it fits better than the previous models. The deviance statistic is \(G^2= 20.2251\) with df = 5, and the Value/df ratio is 4.0450, well above the value near 1 expected when a model fits.
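One way to quantify the lack of fit is to compare the deviance with a chi-square distribution on the residual degrees of freedom. The following is a minimal sketch in R that uses only the two numbers reported above; the variable names are illustrative.

```r
# Upper-tail chi-square probability for the reported deviance
G2 <- 20.2251   # deviance statistic reported above
df <- 5         # residual degrees of freedom

p_value <- pchisq(G2, df = df, lower.tail = FALSE)
p_value         # a small p-value indicates significant lack of fit

ratio <- G2 / df
ratio           # the Value/df ratio
```

A p-value far below 0.05 agrees with the conclusion that the model does not fit well.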

 Stop and Think!
Can you figure out DF?

Since the only terms that separate this model from the saturated one are those for the three-way interactions, the degrees of freedom must be \((I-1)(J-1)(K-1)\), which is \((2-1)(2-1)(6-1)=5\) in this example.
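The same count can be checked by subtracting the number of free parameters in the model from the number of cells. This is a small sketch in R, assuming the usual baseline constraints so that a factor with \(I\) levels contributes \(I-1\) free parameters; the variable names are illustrative.

```r
# Numbers of levels in the Berkeley example:
# sex (2), admission status (2), department (6)
I <- 2; J <- 2; K <- 6

n_cells <- I * J * K                        # parameters in the saturated model
n_main  <- 1 + (I - 1) + (J - 1) + (K - 1)  # intercept plus main effects
n_two   <- (I - 1) * (J - 1) +              # two-way interaction terms
           (I - 1) * (K - 1) +
           (J - 1) * (K - 1)

df_resid <- n_cells - (n_main + n_two)
df_resid                                    # equals (I - 1) * (J - 1) * (K - 1)
```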

In SAS, this model can be fitted as:

proc genmod data=berkeley order=data;
class D S A;
model count = D S A D*S D*A S*A / dist=poisson link=log lrci type3 obstats;
run;

Are all terms in the model significant? Look at the "LR Statistics For Type 3 Analysis" output; recall that we need the option "type3" to obtain it. For example, here is the ANOVA-like table, which shows that the S*A association does not seem to be significant:

 
LR Statistics For Type 3 Analysis
Source DF Chi-Square Pr > ChiSq
D 5 360.23 <.0001
S 1 252.58 <.0001
A 1 303.43 <.0001
D*S 5 1128.70 <.0001
D*A 5 763.40 <.0001
S*A 1 1.53 0.2159

Here is part of the output from the "Analysis of Parameter Estimates", giving the values for all the parameters:

 
Analysis Of Maximum Likelihood Parameter Estimates
Parameter   DF Estimate Standard Error Likelihood Ratio 95% Confidence Limits Wald Chi-Square Pr > ChiSq
Intercept     1 3.1374 0.1567 2.8166 3.4322 400.74 <.0001
D DeptA   1 1.1356 0.1820 0.7858 1.5007 38.94 <.0001
D DeptB   1 -0.3425 0.2533 -0.8525 0.1450 1.83 0.1763
D DeptC   1 2.2228 0.1649 1.9102 2.5579 181.77 <.0001
D DeptD   1 1.7439 0.1682 1.4241 2.0848 107.52 <.0001
D DeptE   1 1.4809 0.1762 1.1439 1.8361 70.64 <.0001
D DeptF   0 0.0000 0.0000 0.0000 0.0000 . .
S Male   1 -0.0037 0.1065 -0.2126 0.2048 0.00 0.9720
S Female   0 0.0000 0.0000 0.0000 0.0000 . .
A Reject   1 2.6246 0.1577 2.3270 2.9467 276.88 <.0001
A Accept   0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptA Male 1 2.0023 0.1357 1.7394 2.2717 217.68 <.0001
D*S DeptA Female 0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptB Male 1 3.0771 0.2229 2.6595 3.5364 190.63 <.0001
D*S DeptB Female 0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptC Male 1 -0.6628 0.1044 -0.8679 -0.4587 40.34 <.0001
D*S DeptC Female 0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptD Male 1 0.0440 0.1057 -0.1633 0.2513 0.17 0.6774
D*S DeptD Female 0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptE Male 1 -0.7929 0.1167 -1.0226 -0.5652 46.19 <.0001
D*S DeptE Female 0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptF Male 0 0.0000 0.0000 0.0000 0.0000 . .
D*S DeptF Female 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptA Reject 1 -3.3065 0.1700 -3.6510 -2.9834 378.38 <.0001
D*A DeptA Accept 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptB Reject 1 -3.2631 0.1788 -3.6240 -2.9220 333.12 <.0001
D*A DeptB Accept 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptC Reject 1 -2.0439 0.1679 -2.3842 -1.7247 148.24 <.0001
D*A DeptC Accept 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptD Reject 1 -2.0119 0.1699 -2.3559 -1.6884 140.18 <.0001
D*A DeptD Accept 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptE Reject 1 -1.5672 0.1804 -1.9300 -1.2213 75.44 <.0001
D*A DeptE Accept 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptF Reject 0 0.0000 0.0000 0.0000 0.0000 . .
D*A DeptF Accept 0 0.0000 0.0000 0.0000 0.0000 . .
S*A Male Reject 1 0.0999 0.0808 -0.0582 0.2588 1.53 0.2167
S*A Male Accept 0 0.0000 0.0000 0.0000 0.0000 . .
S*A Female Reject 0 0.0000 0.0000 0.0000 0.0000 . .
S*A Female Accept 0 0.0000 0.0000 0.0000 0.0000 . .
Scale     0 1.0000 0.0000 1.0000 1.0000    

Note: The scale parameter was held fixed.

Recall that we are interested in the highest-order terms, which here are the two-way associations, and that these correspond to log-odds ratios. For example, the coefficient 0.0999, reported in the row beginning with "S*A Male Reject", is the conditional log-odds ratio between sex and admission status. The interpretation is as follows: for a fixed department, the odds that a male applicant is rejected are \(\exp(0.0999)\approx 1.105\) times the odds that a female applicant is rejected.

Although the department is fixed for this interpretation (so the comparison is among individuals applying to the same department), it doesn't matter which department we're focusing on; they all lead to the same result under this model. However, this model does not fit well, so we can't really rely on the inferences based on this model.
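The same transformation can be applied to the likelihood ratio confidence limits reported in the "S*A Male Reject" row (-0.0582 and 0.2588) to obtain an approximate 95% interval for the conditional odds ratio:

\[
\left(e^{-0.0582},\ e^{0.2588}\right)\approx(0.943,\ 1.295)
\]

Because this interval contains 1, the data are consistent with no conditional association between sex and admission status, in line with the Type 3 p-value of 0.2159 for S*A.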

In R, here is one way of fitting this model (note that this syntax will also include all first-order terms automatically):

berk.ha = glm(Freq~(Admit+Gender+Dept)^2, family=poisson(), data=berk.data)

Here is part of the summary output that gives values for all the parameter estimates:

> summary(berk.ha)
...
Coefficients:
                          Estimate Std. Error z value Pr(>|z|)    
(Intercept)               3.137358   0.156723  20.019  < 2e-16 ***
AdmitRejected             2.624559   0.157728  16.640  < 2e-16 ***
GenderMale               -0.003731   0.106460  -0.035    0.972    
DeptA                     1.135552   0.181963   6.241 4.36e-10 ***
DeptB                    -0.342489   0.253251  -1.352    0.176    
DeptC                     2.222782   0.164869  13.482  < 2e-16 ***
DeptD                     1.743872   0.168181  10.369  < 2e-16 ***
DeptE                     1.480918   0.176194   8.405  < 2e-16 ***
AdmitRejected:GenderMale  0.099870   0.080846   1.235    0.217    
AdmitRejected:DeptA      -3.306480   0.169982 -19.452  < 2e-16 ***
AdmitRejected:DeptB      -3.263082   0.178784 -18.252  < 2e-16 ***
AdmitRejected:DeptC      -2.043882   0.167868 -12.176  < 2e-16 ***
AdmitRejected:DeptD      -2.011874   0.169925 -11.840  < 2e-16 ***
AdmitRejected:DeptE      -1.567174   0.180436  -8.685  < 2e-16 ***
GenderMale:DeptA          2.002319   0.135713  14.754  < 2e-16 ***
GenderMale:DeptB          3.077140   0.222869  13.807  < 2e-16 ***
GenderMale:DeptC         -0.662814   0.104357  -6.351 2.13e-10 ***
GenderMale:DeptD          0.043995   0.105736   0.416    0.677    
GenderMale:DeptE         -0.792867   0.116664  -6.796 1.07e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2650.095  on 23  degrees of freedom
Residual deviance:   20.204  on  5  degrees of freedom
AIC: 217.26
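The fit above relies on a data frame berk.data that is not constructed in this section. The following is a minimal sketch of one way to build it, assuming it comes from R's built-in UCBAdmissions table and that the reference levels were set to Female and department F; those two choices are inferred from the coefficient names in the output and are not confirmed by the text.

```r
# Assumption: berk.data is the built-in UCBAdmissions table in long format
berk.data <- as.data.frame(UCBAdmissions)

# Assumption: reference levels chosen to match the coefficient names above
berk.data$Gender <- relevel(berk.data$Gender, ref = "Female")
berk.data$Dept   <- relevel(berk.data$Dept, ref = "F")

# Homogeneous association model: all main effects and all two-way terms
berk.ha <- glm(Freq ~ (Admit + Gender + Dept)^2,
               family = poisson(), data = berk.data)

deviance(berk.ha)      # residual deviance
df.residual(berk.ha)   # residual degrees of freedom
```

Changing the reference levels changes the individual coefficients but not the fitted values, the deviance, or the degrees of freedom.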

Recall that we are interested in the highest-order terms, which here are the two-way associations, and that these correspond to log-odds ratios. For example, the first interaction coefficient, 0.0999, reported in the row beginning with "AdmitRejected:GenderMale", is the conditional log-odds ratio between sex and admission status. The interpretation is as follows: for a fixed department, the odds that a male applicant is rejected are \(\exp(0.0999)\approx 1.105\) times the odds that a female applicant is rejected.

Although the department is fixed for this interpretation (so the comparison is among individuals applying to the same department), it doesn't matter which department we're focusing on; they all lead to the same result under this model. However, this model does not fit well (deviance statistic of 20.204 with 5 df), so we can't really rely on the inferences based on this model.
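The odds ratio and likelihood ratio tests for the two-way terms can also be computed directly from the fitted object. This sketch assumes berk.data is built from the built-in UCBAdmissions table with Female and department F as reference levels (an assumption inferred from the coefficient names); drop1() is used here as a stand-in for the Type 3 analysis in SAS, since it tests each highest-order term by removing it from the model.

```r
# Assumption: data built from UCBAdmissions with these reference levels
berk.data <- as.data.frame(UCBAdmissions)
berk.data$Gender <- relevel(berk.data$Gender, ref = "Female")
berk.data$Dept   <- relevel(berk.data$Dept, ref = "F")

berk.ha <- glm(Freq ~ (Admit + Gender + Dept)^2,
               family = poisson(), data = berk.data)

# Conditional odds ratio between sex and admission status
or_sa <- exp(coef(berk.ha)[["AdmitRejected:GenderMale"]])
or_sa

# Wald 95% confidence interval on the odds-ratio scale
est <- coef(summary(berk.ha))["AdmitRejected:GenderMale", ]
ci  <- exp(est[["Estimate"]] + c(-1, 1) * qnorm(0.975) * est[["Std. Error"]])
ci

# Likelihood ratio test for dropping each two-way term in turn
lr <- drop1(berk.ha, test = "Chisq")
lr
```

The row for Admit:Gender in the drop1() table plays the same role as the S*A line of a Type 3 analysis.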