10.2 - Log-linear Models for Three-way Tables

In this section we will extend the concepts we learned about log-linear models for two-way tables to three-way tables. We will learn how to fit various models of independence discussed in Lesson 5 (e.g., conditional independence, joint independence, etc.) and will see additional statistics, besides the usual \(X^2\) and \(G^2\), to assess the model fit and to choose the "best" model.

The notation for log-linear models extends to three-way tables as follows. If \(\mu_{ijk}\) represents the mean for the \(i\)th, \(j\)th, and \(k\)th levels of variables \(A\), \(B\), and \(C\), respectively, then we write

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C+\lambda_{ij}^{AB}+\lambda_{ik}^{AC}+\lambda_{jk}^{BC}+\lambda_{ijk}^{ABC}\)

The main questions of interest are:

What do the \(\lambda\) terms mean in this model?
What hypotheses about them correspond to the models of independence we already know?
What are some efficient ways to specify and interpret these models and tables?
What are some efficient ways to fit and select among many possible models in three and higher dimensions?

As before for three-way tables, there are multiple models we can test and now fit. The log-linear models we will fit and evaluate are

Saturated
Complete Independence
Joint Independence
Conditional Independence
Homogeneous Association

Example: Graduate Admissions

Let us go back to our familiar dataset on graduate admissions at Berkeley:

Dept.	Males admitted	Males rejected	Females admitted	Females rejected
A	512	313	89	19
B	353	207	17	8
C	120	205	202	391
D	139	279	131	244
E	53	138	94	299
F	22	351	24	317

Let D = department, S = sex, and A = admission status (rejected or accepted). We analyzed this as a three-way table before, and specifically, we looked at partial and marginal tables. Now we will look at it from a log-linear model point of view. Let \(Y\) be the observed frequency or count in a particular cell of the three-way table.

10.2.1 - Saturated Log-Linear Model

This model is the default model in a way that serves for testing of goodness-of-fit of the other models. Recall that the saturated model has the maximum number of parameters and fitting a saturated model is the same as estimating ML parameters of distributions appropriate for each cell of the contingency table.

Main assumption

The \(N = IJK\) counts in the cells are assumed to be independent observations of a Poisson random variable.

Model structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C+\lambda_{ij}^{AB}+\lambda_{ik}^{AC}+\lambda_{jk}^{BC}+\lambda_{ijk}^{ABC}\)

Parameter constraints can be different, but the (typical) set-to-zero constraints imply that any \(\lambda\) for which a subscript corresponds to the last category (i.e., includes \(I\), \(J\), or \(K\)) is set to 0. For example,

\(\lambda_I^A=\lambda_J^B=\lambda_K^C=\lambda_{Ij}^{AB}=\cdots=\lambda_{ijK}^{ABC}=0\)

Parameter estimation and interpretation

\(\lambda\) represents an "overall" effect or a grand mean (on the log scale) of the expected counts, and it ensures that \(\sum_i\sum_j\sum_k\mu_{ijk}=n\).
\(\lambda_i^A\), \(\lambda_j^B\), and \(\lambda_k^C\) represent the "main" effects of variables \(A\), \(B\), and \(C\), or deviations from the grand mean.
\(\lambda_{ij}^{AB}\), \(\lambda_{ik}^{AC}\), \(\lambda_{jk}^{BC}\) represent the interaction/association between two variables while controlling the third (e.g., conditional odds ratios, test for partial associations) and reflect the departure from independence
\(\lambda_{ijk}^{ABC}\) represents the interaction/association between three variables and reflect the departure from independence

If there is a significant interaction term, we typically do not look at the lower-order terms but only interpret the higher-order terms because the values of lower-order terms are coding dependent and can be misleading.

Model Fit

The saturated model has a perfect fit with \(G^2=0\) and df = 0 since the number of cells is equal to the number of unique parameters in the model.

Model Selection

Relevant when comparing to simpler models. The saturated model is the most complex model possible.

Fitting in SAS and R

Using PROC GENMOD, let us fit the saturated log-linear model.

proc genmod data=berkeley order=data;
class D S A;
model count= D S A D*S D*A S*A D*S*A/dist=poisson link=log;
run;

When we use the order=data option, GENMOD orders the levels of class variables in the same order as they appear in the dataset. For each class variable, GENMOD creates a set of dummy using the last category as a reference group (recall the CATMOD and GENMOD coding from the previous lesson). Therefore, we can interpret a two-way association as the log-odds ratio for the two variables in question, with the other variable held constant at its last category (i.e., conditional odds-ratios).

Here's a portion of the SAS output that includes ML estimates.


Analysis Of Maximum Likelihood Parameter Estimates
Parameter				DF	Estimate	Standard Error
Intercept				1	3.1781	0.2041
D	DeptA			1	1.3106	0.2300
D	DeptB			1	-0.3448	0.3170
D	DeptC			1	2.1302	0.2159
D	DeptD			1	1.6971	0.2220
D	DeptE			1	1.3652	0.2287
D	DeptF			0	0.0000	0.0000
S	Male			1	-0.0870	0.2952
S	Female			0	0.0000	0.0000
A	Reject			1	2.5808	0.2117
A	Accept			0	0.0000	0.0000
D*S	DeptA	Male		1	1.8367	0.3167
D*S	DeptA	Female		0	0.0000	0.0000
D*S	DeptB	Male		1	3.1203	0.3857
D*S	DeptB	Female		0	0.0000	0.0000
D*S	DeptC	Male		1	-0.4338	0.3169
D*S	DeptC	Female		0	0.0000	0.0000
D*S	DeptD	Male		1	0.1391	0.3194
D*S	DeptD	Female		0	0.0000	0.0000
D*S	DeptE	Male		1	-0.4860	0.3415
D*S	DeptE	Female		0	0.0000	0.0000
D*S	DeptF	Male		0	0.0000	0.0000
D*S	DeptF	Female		0	0.0000	0.0000
D*A	DeptA	Reject		1	-4.1250	0.3297
D*A	DeptA	Accept		0	0.0000	0.0000
D*A	DeptB	Reject		1	-3.3346	0.4782
D*A	DeptB	Accept		0	0.0000	0.0000
D*A	DeptC	Reject		1	-1.9204	0.2288
D*A	DeptC	Accept		0	0.0000	0.0000
D*A	DeptD	Reject		1	-1.9589	0.2378
D*A	DeptD	Accept		0	0.0000	0.0000
D*A	DeptE	Reject		1	-1.4237	0.2425
D*A	DeptE	Accept		0	0.0000	0.0000
D*A	DeptF	Reject		0	0.0000	0.0000
D*A	DeptF	Accept		0	0.0000	0.0000
S*A	Male	Reject		1	0.1889	0.3052
S*A	Male	Accept		0	0.0000	0.0000
S*A	Female	Reject		0	0.0000	0.0000
S*A	Female	Accept		0	0.0000	0.0000
DSA	DeptA	Male	Reject	1	0.8632	0.4027
DSA	DeptA	Male	Accept	0	0.0000	0.0000
DSA	DeptA	Female	Reject	0	0.0000	0.0000
DSA	DeptA	Female	Accept	0	0.0000	0.0000
DSA	DeptB	Male	Reject	1	0.0311	0.5335
...	...	...	...	...	...	...

Note:The scale parameter was held fixed.

By default, R will set the first category (first alphabetically) as the reference level with the set-to-zero constraint, which we can change manually if we wish. The results will be equivalent, but the interpretations of the estimates reported will be different. With this choice of reference, we can interpret a two-way association as the log-odds ratio for the two variables in question, with the other variable held constant at its last category (i.e., conditional odds-ratios). Also, note that the built-in data set UCBAdmissions uses "Gender" instead of "Sex".

UCBAdmissions 
berk.data = as.data.frame(UCBAdmissions)
berk.data$Gender = relevel(berk.data$Gender, ref='Female')
berk.data$Dept = relevel(berk.data$Dept, ref='F')
berk.sat = glm(Freq~Admit*Gender*Dept, family=poisson(), data=berk.data)

Here's a portion of the summary output that includes ML estimates.

> summary(berk.sat)

Call:
glm(formula = Freq ~ Admit * Gender * Dept, family = poisson(), 
    data = berk.data)

Deviance Residuals: 
 [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Coefficients:
                               Estimate Std. Error z value Pr(>|z|)    
(Intercept)                     3.17805    0.20412  15.569  < 2e-16 ***
AdmitRejected                   2.58085    0.21171  12.190  < 2e-16 ***
GenderMale                     -0.08701    0.29516  -0.295   0.7682    
DeptA                           1.31058    0.23001   5.698 1.21e-08 ***
DeptB                          -0.34484    0.31700  -1.088   0.2767    
DeptC                           2.13021    0.21591   9.866  < 2e-16 ***
DeptD                           1.69714    0.22204   7.644 2.11e-14 ***
DeptE                           1.36524    0.22870   5.969 2.38e-09 ***
AdmitRejected:GenderMale        0.18890    0.30516   0.619   0.5359    
AdmitRejected:DeptA            -4.12505    0.32968 -12.512  < 2e-16 ***
AdmitRejected:DeptB            -3.33462    0.47817  -6.974 3.09e-12 ***
AdmitRejected:DeptC            -1.92041    0.22876  -8.395  < 2e-16 ***
AdmitRejected:DeptD            -1.95888    0.23781  -8.237  < 2e-16 ***
AdmitRejected:DeptE            -1.42370    0.24250  -5.871 4.33e-09 ***
GenderMale:DeptA                1.83670    0.31672   5.799 6.66e-09 ***
GenderMale:DeptB                3.12027    0.38572   8.090 5.99e-16 ***
GenderMale:DeptC               -0.43376    0.31687  -1.369   0.1710    
GenderMale:DeptD                0.13907    0.31938   0.435   0.6632    
GenderMale:DeptE               -0.48599    0.34151  -1.423   0.1547    
AdmitRejected:GenderMale:DeptA  0.86318    0.40267   2.144   0.0321 *  
AdmitRejected:GenderMale:DeptB  0.03113    0.53349   0.058   0.9535    
AdmitRejected:GenderMale:DeptC -0.31382    0.33741  -0.930   0.3523    
AdmitRejected:GenderMale:DeptD -0.10691    0.34013  -0.314   0.7533    
AdmitRejected:GenderMale:DeptE -0.38908    0.36500  -1.066   0.2864    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2.6501e+03  on 23  degrees of freedom
Residual deviance: 1.4788e-13  on  0  degrees of freedom
AIC: 207.06

The intercept is a normalizing constant and should be ignored. The main effects for D, S, and A are all difficult to interpret and not very meaningful since we have significant two-way and three-way associations; the two- and three-way associations are highly meaningful. For example, the estimated coefficient for the SA association is 0.1889.

Exponentiating this coefficient gives

\(\exp(0.1889)=1.208\),

which is the estimated SA odds ratio for Department F since that is the reference department in this analysis. The reference group for S is "female," and the reference group for A is "accept." If we write the \(2 \times 2\) table for \(S \times A\) in Department F, i.e., the "partial" table, with the reference groups in the last row and column, we get

Dept F	Reject	Accept
Men	351	22
Women	317	24

for which the estimated odds ratio is:

\(351 \times 24/317 \times 22=1.208\).

The Wald z-statistic for this coefficient,

\(z=0.1889/0.3052\),

which corresponds to Chi-square statistic \(0.62^2 = 0.38\) with p-value 0.5359 and indicates that the SA odds ratio for Department F is not significantly different from 1.00 or that the log odds ratio is not significantly different from 0.

The 95% confidence interval for the parameter estimate, that is, for the log odds ratio, is \((−0.4092, 0.7870)\). Thus the 95% confidence interval for the odds ratio is

\((\exp(-0.4092),\exp(0.7870))=(0.67,2.20)\).

To get the SA odds ratio for any other department, we have to combine the SA coefficient with one of the DSA coefficients. For example, the SA odds ratio for Department A is

\(\exp(0.1889+0.8632)=2.864\).

Based on:

\begin{align} \theta_{SA(i="A")} &= \dfrac{\mu_{ijk} \mu_{ij'k'}}{\mu_{ij'k} \mu_{ijk'}}\\ &= \\ &= (\lambda_{jk}+\lambda_{j'k'}-\lambda_{j'k}-\lambda_{jk'})+(\lambda_{ijk}+\lambda_{ij'k'}-\lambda_{ij'k}-\lambda_{ijk'})\\ &= (0.1889+0-0-0)+(0.8632+0-0-0) \end{align}

The Wald z-statistic for the first DSA coefficient,

\(z=0.8632/0.4027\),

indicates that the SA odds ratio for Department A is significantly different from the SA odds ratio in Department F. To see if the SA odds ratio in Department A is significantly different from 1.00, we would have to compute the standard error for the sum of the two coefficients using the estimated covariance matrix, or refit the model by fixing the level of interest equal to 0.

In many situations, however, we take recourse to the saturated model only as a last resort. As the number of variables grow, saturated models become more and more difficult to interpret. In the following sections, we look at simpler models which are useful in explaining the associations among the discrete variables of interest.

10.2.2 - Complete Independence

This is the most restrictive model in that all variables are assumed to be jointly independent, regardless of any conditioning. Equivalently, this requires that each two and three-way distribution factors into the product of the marginal distributions involved.

Main assumptions

The \(N = IJK\) counts in the cells are assumed to be independent observations of a Poisson random variable, and
there are no partial interactions: \(\lambda_{ij}^{AB} =\lambda_{ik}^{AC} =\lambda_{jk}^{BC}=0\), for all \(i, j, k\), and
there is no three-way interaction: \(\lambda_{ijk}^{ABC}=0\) for all \(i, j, k\).

Note the constraints above are in addition to the usual set-to-zero or sum-to-zero constraints (present even in the saturated model) imposed to avoid overparameterization.

Model Structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C\)

In SAS, the model of complete independence (D, S, A) can be fitted with the following commands:

proc genmod data=berkeley order=data;
class D S A;
model count = D S A / dist=poisson link=log;
run;

What are the estimated odds of male vs. female in this example? From the output, the ML estimate of the parameter S-Male, thus, the odds of being male are higher than being female applicant:

\(\exp(0.382) = 1.467 = 2691/1835\)

with p-value < .0001 indicating that the odds are significantly different. Note these are odds, not odds ratios! (When we are dealing with main effects we do not look at odds ratios.)

What about the odds of being rejected? What can we conclude from the part of the output below?


Analysis Of Maximum Likelihood Parameter Estimates
Parameter		DF	Estimate	Standard Error	Wald 95% Confidence Limits		Wald Chi-Square	Pr > ChiSq
Intercept		1	4.7207	0.0455	4.6315	4.8100	10748.0	<.0001
D	DeptA	1	0.2675	0.0497	0.1701	0.3650	28.95	<.0001
D	DeptB	1	-0.1993	0.0558	-0.3086	-0.0900	12.77	0.0004
D	DeptC	1	0.2513	0.0499	0.1535	0.3491	25.37	<.0001
D	DeptD	1	0.1037	0.0516	0.0025	0.2048	4.04	0.0445
D	DeptE	1	-0.2010	0.0558	-0.3103	-0.0916	12.98	0.0003
D	DeptF	0	0.0000	0.0000	0.0000	0.0000	.	.
S	Male	1	0.3829	0.0303	0.3235	0.4422	159.93	<.0001
S	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
A	Reject	1	0.4567	0.0305	0.3969	0.5165	224.15	<.0001
A	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
Scale		0	1.0000	0.0000	1.0000	1.0000

Note:The scale parameter was held fixed.

But, we should really check the overall fit of the model first, to determine if these estimates are meaningful.

Model Fit

The goodness-of-fit statistics indicate that the model does not fit.


Criteria For Assessing Goodness Of Fit
Criterion	DF	Value	Value/DF
Deviance	16	2097.6712	131.1045
Scaled Deviance	16	2097.6712	131.1045
Pearson Chi-Square	16	2000.3281	125.0205
Scaled Pearson X2	16	2000.3281	125.0205
Log Likelihood		19464.3700
Full Log Likelihood		-1128.3655
AIC (smaller is better)		2272.7309
AICC (smaller is better)		2282.3309
BIC (smaller is better)		2282.1553

If the model fits well, the "Value/DF" would be close to 1. Recall how we get the degrees of freedom:

df = number of cells - number of fitted parameters in the model.

df = number of fitted parameters in the saturated model - number of fitted parameters in our model.

Recall that these goodness-of-fit statistics compare the fitted model to the saturated model. Thus, the model of complete independence does not fit well in comparison to the saturated model.

In R, the model of complete independence can be fit with the following commands:

berk.ind = glm(Freq~Admit+Gender+Dept, family=poisson(), data=berk.data)

What are the estimated odds of male vs. female in this example? From the output, the ML estimate of the parameter GenderMale, thus, the odds of being male are higher than being female applicant:

\(\exp(0.382) = 1.467 = 2691/1835\)

with p-value < 2e-16, indicating that the odds are significantly different. Note these are odds, not odds ratios! (When we are dealing with main effects we do not look at odds ratios.)

> summary(berk.ind)

Call:
glm(formula = Freq ~ Admit + Gender + Dept, family = poisson(), 
    data = berk.data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-18.170   -7.719   -1.008    4.734   17.153  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)    4.72072    0.04553 103.673  < 2e-16 ***
AdmitRejected  0.45674    0.03051  14.972  < 2e-16 ***
GenderMale     0.38287    0.03027  12.647  < 2e-16 ***
DeptA          0.26752    0.04972   5.380 7.44e-08 ***
DeptB         -0.19927    0.05577  -3.573 0.000352 ***
DeptC          0.25131    0.04990   5.036 4.74e-07 ***
DeptD          0.10368    0.05161   2.009 0.044533 *  
DeptE         -0.20098    0.05579  -3.602 0.000315 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2650.1  on 23  degrees of freedom
Residual deviance: 2097.7  on 16  degrees of freedom
AIC: 2272.7

What about the odds of being rejected? What can we conclude from the part of the output above?

But, we should really check the overall fit of the model first, to determine if these estimates are meaningful.

Model Fit

The reported "Residual deviance" of 2097.7 on 16 degrees of freedom indicates that the model does not fit. If the model fits well, the "Value/DF" would be close to 1. Recall how we get the degrees of freedom:

df = number of cells - number of fitted parameters in the model.

df = number of fitted parameters in the saturated model - number of fitted parameters in our model.

Recall that this goodness-of-fit statistic compares the fitted model to the saturated model. Thus, the model of complete independence does not fit well in comparison to the saturated model.

Next, let us see an example of the joint independence model.

10.2.3 - Joint Independence

With three variables, there are three ways to satisfy joint independence. The assumption is that one variable is independent of the other two, but the latter two can have an arbitrary association. For example, if \(A\) is jointly independent of \(B\) and \(C\), which we denote by \((A, BC)\), then \(A\) is independent of each of the others, and we can factor the three-way joint distribution into the product of the marginal distribution for \(A\) and the two-way joint distribution of \((B,C)\).

Main assumptions

(these are stated for the case \((A,BC)\) but hold in a similar way for \((B, AC)\) and \((C, AB)\) as well)

The \(N = IJK\) counts in the cells are assumed to be independent observations of a Poisson random variable, and
there are no partial interactions involving \(A\): \(\lambda_{ij}^{AB} =\lambda_{ik}^{AC} =0\), for all \(i, j, k\), and
there is no three-way interaction: \(\lambda_{ijk}^{ABC}=0\) for all \(i, j, k\).

Note the constraints above are in addition to the usual set-to-zero or sum-to-zero constraints (present even in the saturated model) imposed to avoid overparameterization.

Model Structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C +\lambda_{jk}^{BC}\)

In the example below, we consider the model where admission status is jointly independent of department and sex, which we denote by \((A, DS)\).

In SAS, the model can be fitted like this:

proc genmod data=berkeley order=data;
class D S A;
model count = D S A D*S / dist=poisson link=log lrci type3 obstats;
run;

This model implies that the association between D and S does NOT depend on the level of the variable A. That is, the association between department and sex is independent of the rejection/acceptance decision.

\(\theta_{DS(A=reject)}=\theta_{DS(A=accept)}=\dfrac{\text{exp}(\lambda_{ij}^{DS})\text{exp}(\lambda_{i'j'}^{DS})}{\text{exp}(\lambda_{i'j}^{DS})\text{exp}(\lambda_{ij'}^{DS})}\)

Since we are assuming that the (DS) distribution is independent of A, then we are assuming that the conditional distribution of DS, given A, is the same as the unconditional distribution of DS. And equivalently, the conditional distribution of A, given DS, is the same as the unconditional distribution of A. In other words, if this model fits well, neither department nor sex tells us anything significant about admission status.

The first estimated coefficient 1.9436 for the DS associations...

D*S        DeptA   Male     1    1.9436    0.1268    1.6950    2.1921   234.84

implies that the estimated odds ratio between sex and department (specifically, A versus F) is \(\exp(1.9436) = 6.98\) with 95% CI

\((\exp(1.695), \exp(2.192))= (5.45, 8.96)\)

But, we should really check the overall fit of the model first, to determine if these estimates are meaningful.

Model Fit

The goodness-of-fit statistics indicate that the model does not fit since the "Value/df" is much larger than 1.


Criteria For Assessing Goodness Of Fit
Criterion	DF	Value	Value/DF
Deviance	11	877.0564	79.7324
Scaled Deviance	11	877.0564	79.7324
Pearson Chi-Square	11	797.7041	72.5186
Scaled Pearson X2	11	797.7041	72.5186
Log Likelihood		20074.6774

How did we get these degrees of freedom? As usual, we're comparing this model to the saturated model, and the df is the difference in the numbers of parameters involved:

\(DF = (IJK-1) - [(I-1)+(J-1)+(K-1)+(J-1)(K-1)] = (I-1)(JK-1) \)

With \(I=2\), \(J=2\), and \(K=6\) corresponding to the levels of A, S, and D, respectively, this works out to be 11.

As before, we can look at the residuals for more information about why this model fits poorly. Recall that adjusted residuals have approximately a N(0, 1) distribution (i.e., "Std Pearson Residual"). In general, we have a lack of fit if (1) we have a large number of cells and adjusted residuals are greater than 3, or (2) we have a small number of cells and adjusted residuals are greater than 2. Here is only part of the output. Notice that the absolute value of the standardized residual for the first six cells are all large (e.g., in cell 1, the value is \(-15.1793\)). Many other such residuals are rather large as well, indicating that this model fits poorly.

Evaluate the residuals


Observation Statistics
Observation	count	D	S	A	Predicted Value	Linear Predictor	Standard Error of the Linear Predictor	HessWgt	Lower	Upper	Raw Residual	Pearson Residual	Deviance Residual	Std Deviance Residual	Std Pearson Residual	Likelihood Residual	Leverage	CookD	DFBETA_Intercept	DFBETA_DDeptA	DFBETA_DDeptB	DFBETA_DDeptC	DFBETA_DDeptD	DFBETA_DDeptE	DFBETA_SMale	DFBETA_AReject	DFBETA_DDeptASMale	DFBETA_DDeptBSMale	DFBETA_DDeptCSMale	DFBETA_DDeptDSMale	DFBETA_DDeptESMale	DFBETAS_Intercept	DFBETAS_DDeptA	DFBETAS_DDeptB	DFBETAS_DDeptC	DFBETAS_DDeptD	DFBETAS_DDeptE	DFBETAS_SMale	DFBETAS_AReject	DFBETAS_DDeptASMale	DFBETAS_DDeptBSMale	DFBETAS_DDeptCSMale	DFBETAS_DDeptDSMale	DFBETAS_DDeptESMale
1	313	DeptA	Male	Reject	505.09831	6.2247531	0.0367703	505.09831	469.97739	542.84378	-192.0983	-8.547431	-9.199152	-16.3367	-15.17931	-15.55562	0.6829213	38.173694	0.1338572	1.051E-15	0	0	-2.63E-16	0	0	-0.218635	-0.734349	5.255E-16	0	0	0	2.3367482	9.518E-15	0	0	-3.51E-15	0	0	-7.166703	-5.790199	2.414E-15	0	0	0
2	512	DeptA	Male	Accept	319.90169	5.7680137	0.0395092	319.90169	296.06444	345.65816	192.09831	10.740272	9.8692317	13.948262	15.179309	14.575998	0.4993589	17.678562	0.1338572	-1.33E-15	-4.99E-16	-4.99E-16	-4.99E-16	-4.99E-16	-6.66E-16	-0.218635	0.4650965	3.328E-16	6.656E-16	6.656E-16	6.656E-16	2.3367481	-1.21E-14	-2.41E-15	-7.35E-15	-6.67E-15	-6.75E-15	-8.88E-15	-7.166703	3.6671961	1.529E-15	6.534E-15	6.441E-15	5.751E-15
3	19	DeptA	Female	Reject	66.122038	4.1915021	0.0969494	66.122038	54.679277	79.959431	-47.12204	-5.794967	-6.845121	-11.12613	-9.419201	-10.09928	0.6214932	11.205917	0.0275065	-1.152726	-7.77E-16	-7.77E-16	-7.86E-16	-7.77E-16	-1.13E-15	-0.044928	1.1527259	1.062E-15	1.074E-15	1.075E-15	1.128E-15	0.4801819	-10.4398	-3.75E-15	-1.14E-14	-1.05E-14	-1.05E-14	-1.51E-14	-1.472697	9.0890214	4.878E-15	1.054E-14	1.04E-14	9.745E-15
4	89	DeptA	Female	Accept	41.878088	3.7347627	0.0980209	41.878088	34.558213	50.748407	47.121912	7.2816446	6.3202597	8.1755761	9.4191761	8.6973682	0.402369	4.5948769	0.0275065	0.7300717	3.761E-16	3.761E-16	3.761E-16	3.761E-16	5.813E-16	-0.044928	-0.730072	-5.47E-16	-5.47E-16	-5.47E-16	-5.81E-16	0.4801807	6.6119817	1.815E-15	5.535E-15	5.027E-15	5.083E-15	7.759E-15	-1.472693	-5.756474	-2.51E-15	-5.37E-15	-5.29E-15	-5.02E-15
5	207	DeptB	Male	Reject	342.85461	5.8373065	0.0438822	342.85461	314.59903	373.64795	-135.8546	-7.337015	-7.925271	-13.59608	-12.58691	-12.93864	0.6602177	23.679966	0.0883403	-6.94E-16	0	-6.94E-16	-6.94E-16	-3.47E-16	-6.94E-16	-0.14429	1.04E-15	-0.713979	6.936E-16	1.04E-15	6.936E-16	1.5421589	-6.28E-15	0	-1.02E-14	-9.27E-15	-4.69E-15	-9.26E-15	-4.729733	8.203E-15	-3.279442	6.809E-15	1.007E-14	5.993E-15
6	353	DeptB	Male	Accept	217.14539	5.3805671	0.0462014	217.14539	198.34621	237.72635	135.85461	9.2193238	8.4461135	11.531262	12.586906	12.032087	0.4635118	10.529199	0.0883403	-1.1E-16	0	1.098E-16	0	-1.1E-16	0	-0.14429	-2.2E-16	0.4521955	0	-2.2E-16	0	1.5421589	-9.95E-16	0	1.616E-15	0	-1.48E-15	0	-4.729733	-1.73E-15	2.0770196	0	-2.13E-15	0
7	8	DeptB	Female	Reject	15.30601	2.7282455	0.2003495	15.30601	10.335325	22.667301	-7.30601	-1.867451	-2.056977	-3.312462	-3.007258	-3.128479	0.6143822	1.108357	0.0041861	1.682E-16	-0.75785	1.939E-16	1.926E-16	1.939E-16	1.981E-16	-0.006837	-1.95E-16	0.7578499	-2.47E-16	-2.31E-16	-2.39E-16	0.0730767	1.523E-15	-3.657546	2.853E-15	2.574E-15	2.62E-15	2.644E-15	-0.224123	-1.54E-15	3.4809481	-2.43E-15	-2.23E-15	-2.07E-15
8	17	DeptB	Female	Accept	9.6939907	2.2715062	0.2008702	9.6939907	6.5391536	14.37089	7.3060093	2.3465452	2.1180239	2.7143924	3.0072581	2.8325522	0.3911414	0.4469052	0.0041861	-1.3E-16	0.4799807	-1.41E-16	-1.41E-16	-1.41E-16	-1.46E-16	-0.006837	1.457E-16	-0.479981	1.769E-16	1.665E-16	1.717E-16	0.0730767	-1.18E-15	2.3164901	-2.07E-15	-1.88E-15	-1.9E-15	-1.94E-15	-0.224123	1.149E-15	-2.204642	1.737E-15	1.612E-15	1.484E-15
9	205	DeptC	Male	Reject	198.97812	5.2931949	0.0567174	198.97812	178.04404	222.3736	6.0218769	0.426903	0.4247764	0.7080436	0.7115884	0.7103146	0.6400844	0.0692709	-0.003697	2.177E-17	2.902E-17	0	2.177E-17	1.451E-17	2.902E-17	0.006038	-2.9E-17	-4.35E-17	0.0514811	-2.9E-17	-2.9E-17	-0.064534	1.971E-16	1.401E-16	0	2.909E-16	1.961E-16	3.874E-16	0.197922	-2.29E-16	-2E-16	0.5053783	-2.81E-16	-2.51E-16
10	120	DeptC	Male	Accept	126.02188	4.8364555	0.0585301	126.02188	112.36343	141.34059	-6.021879	-0.536425	-0.540785	-0.717372	-0.711589	-0.714881	0.431723	0.029591	-0.003697	9.191E-18	-4.6E-18	1.838E-17	4.596E-18	4.596E-18	0	0.006038	0	9.191E-18	-0.032605	0	0	-0.064534	8.324E-17	-2.22E-17	2.705E-16	6.142E-17	6.21E-17	0	0.1979221	0	4.222E-17	-0.320079	0	0
11	391	DeptC	Female	Reject	363.05854	5.8945641	0.0427349	363.05854	333.88785	394.77779	27.941455	1.4664278	1.4481968	2.4948336	2.5262405	2.5157016	0.6630449	0.9659997	-0.018322	1.899E-17	-9.36E-17	0.1398371	-1.58E-17	1.433E-17	-3.99E-18	0.0299254	-1E-17	1.916E-16	-0.139837	3.936E-17	3.992E-18	-0.31984	1.719E-16	-4.52E-16	2.0575648	-2.11E-16	1.936E-16	-5.33E-17	0.9809346	-7.92E-17	8.802E-16	-1.372749	3.809E-16	3.449E-17
12	202	DeptC	Female	Accept	229.94146	5.4378247	0.0451131	229.94146	210.48294	251.19886	-27.94146	-1.84264	-1.881985	-2.580183	-2.526241	-2.555081	0.4679758	0.4318154	-0.018322	9.111E-17	1.367E-16	-0.088565	9.111E-17	6.833E-17	9.111E-17	0.0299254	-9.11E-17	-2.05E-16	0.0885652	-1.14E-16	-9.11E-17	-0.31984	8.251E-16	6.595E-16	-1.303149	1.218E-15	9.233E-16	1.216E-15	0.9809347	-7.18E-16	-9.42E-16	0.8694243	-1.1E-15	-7.87E-16
13	279	DeptD	Male	Reject	255.30424	5.5424559	0.0503787	255.30424	231.29997	281.79967	23.695762	1.4830018	1.4609054	2.4622378	2.4994795	2.4864328	0.6479663	0.8845534	-0.014872	2.919E-17	0	-2.92E-17	0	-2.92E-17	0	0.0242913	0	-5.84E-17	0	0.1614174	0	-0.259622	2.644E-16	0	-4.3E-16	0	-3.94E-16	0	0.7962501	0	-2.68E-16	0	1.5620689	0
14	138	DeptD	Male	Accept	161.69576	5.0857166	0.0524112	161.69576	145.91036	179.18892	-23.69576	-1.863466	-1.912007	-2.564589	-2.49948	-2.535876	0.444168	0.3840251	-0.014872	7.395E-17	5.546E-17	7.395E-17	7.395E-17	7.395E-17	7.395E-17	0.0242913	-7.4E-17	-3.7E-17	-7.4E-17	-0.102233	-7.4E-17	-0.259622	6.698E-16	2.677E-16	1.088E-15	9.883E-16	9.993E-16	9.871E-16	0.7962502	-5.83E-16	-1.7E-16	-7.26E-16	-0.989329	-6.39E-16
15	244	DeptD	Female	Reject	229.59014	5.4362957	0.0529774	229.59014	206.94685	254.71097	14.409858	0.9510056	0.941309	1.5784537	1.5947136	1.5889501	0.644368	0.3544502	-0.008952	9.714E-17	4.215E-17	7.729E-17	0.1080507	9.486E-17	1.562E-16	0.0146225	-1.63E-16	-8.21E-17	-1.21E-16	-0.108051	-1.39E-16	-0.156284	8.797E-16	2.034E-16	1.137E-15	1.4439895	1.282E-15	2.085E-15	0.479316	-1.29E-15	-3.77E-16	-1.19E-15	-1.045628	-1.2E-15
16	131	DeptD	Female	Accept	145.40986	4.9795564	0.0549138	145.40986	130.57234	161.93344	-14.40986	-1.194986	-1.215586	-1.622204	-1.594714	-1.610208	0.4384866	0.152763	-0.008953	-1.11E-17	1.113E-17	-1.11E-17	-0.068433	-2.23E-17	-5.56E-17	0.0146225	5.565E-17	1.113E-17	3.339E-17	0.0684334	4.452E-17	-0.156284	-1.01E-16	5.371E-17	-1.64E-16	-0.914544	-3.01E-16	-7.43E-16	0.4793161	4.388E-16	5.112E-17	3.278E-16	0.6622441	3.847E-16
17	138	DeptE	Male	Reject	116.93791	4.7616431	0.0733181	116.93791	101.28541	135.00933	21.062088	1.9477076	1.8932348	3.106604	3.1959883	3.1630862	0.6286041	1.3298634	-0.01253	-2.46E-17	0	-4.92E-17	0	-9.84E-17	-4.92E-17	0.0204658	4.919E-17	0	9.838E-17	4.919E-17	0.2969142	-0.218736	-2.23E-16	0	-7.24E-16	0	-1.33E-15	-6.57E-16	0.6708529	3.878E-16	0	9.657E-16	4.76E-16	2.5655562
18	53	DeptE	Male	Accept	74.062089	4.3049038	0.0747292	74.062089	63.971469	85.744365	-21.06209	-2.447392	-2.579791	-3.368885	-3.195988	-3.298475	0.4135965	0.5541756	-0.01253	9.346E-17	4.673E-17	7.788E-17	6.231E-17	9.346E-17	9.346E-17	0.0204658	-9.35E-17	-6.23E-17	-1.25E-16	-9.35E-17	-0.188049	-0.218736	8.464E-16	2.255E-16	1.146E-15	8.327E-16	1.263E-15	1.247E-15	0.6708529	-7.37E-16	-2.86E-16	-1.22E-15	-9.04E-16	-1.624883
19	299	DeptE	Female	Reject	240.61047	5.4831793	0.0518118	240.61047	217.37632	266.32799	58.389531	3.7642437	3.6255985	6.0928851	6.3258808	6.2443737	0.6459102	5.6150979	-0.036434	3.776E-17	-1.86E-16	2.849E-17	4.013E-17	0.4195937	1.351E-16	0.0595093	-1.63E-16	9.503E-17	-6.36E-17	-1.36E-16	-0.419594	-0.636029	3.419E-16	-8.98E-16	4.193E-16	5.362E-16	5.669627	1.803E-15	1.9506733	-1.29E-15	4.365E-16	-6.24E-16	-1.32E-15	-3.625598
20	94	DeptE	Female	Accept	152.38953	5.02644	0.0537902	152.38953	137.14149	169.33293	-58.38953	-4.72996	-5.093896	-6.812612	-6.325881	-6.602426	0.4409215	2.4276557	-0.036434	1.812E-16	2.718E-16	1.359E-16	1.359E-16	-0.265748	9.059E-17	0.0595093	-9.06E-17	-2.26E-16	-1.36E-16	-9.06E-17	0.2657478	-0.636029	1.641E-15	1.312E-15	1.999E-15	1.816E-15	-3.590832	1.209E-15	1.9506734	-7.14E-16	-1.04E-15	-1.33E-15	-8.77E-16	2.2962557
21	351	DeptF	Male	Reject	228.36589	5.4309491	0.0531121	228.36589	205.78898	253.41969	122.63411	8.1151336	7.5151464	12.598896	13.604754	13.255617	0.6441967	25.777847	-0.076153	1.196E-15	-4.48E-16	0	7.474E-16	2.99E-16	0.9240428	0.1243841	-0.924043	-0.924043	-0.924043	-0.924043	-0.924043	-1.329404	1.083E-14	-2.16E-15	0	9.988E-15	4.039E-15	12.333171	4.077222	-7.285899	-4.244304	-9.071119	-8.942148	-7.984408
22	22	DeptF	Male	Accept	144.63448	4.9742098	0.0550438	144.63448	129.84299	161.111	-122.6345	-10.1971	-12.744	-17.00283	-13.6048	-15.6051	0.4382161	11.106051	-0.076153	-2.84E-16	5.68E-16	2.84E-16	-9.47E-17	9.467E-17	-0.58524	0.1243845	0.58524	0.58524	0.58524	0.58524	0.58524	-1.329408	-2.57E-15	2.741E-15	4.179E-15	-1.27E-15	1.279E-15	-7.811181	4.0772344	4.6145047	2.6881185	5.745169	5.6634853	5.0569034
23	317	DeptF	Female	Reject	208.77408	5.3412527	0.05543	208.77408	187.28133	232.73339	108.22592	7.4901924	6.9525291	11.611039	12.508961	12.194621	0.6414552	21.533862	0.8184913	-0.885183	-0.885183	-0.885183	-0.885183	-0.885183	-0.885183	0.1089309	0.8851832	0.8851832	0.8851832	0.8851832	0.8851832	14.288419	-8.016767	-4.272084	-13.0246	-11.82959	-11.96076	-11.81451	3.5706788	6.9794986	4.0658143	8.6896432	8.5660954	7.6486325
24	24	DeptF	Female	Accept	132.22611	4.8845134	0.0572835	132.22611	118.18364	147.93708	-108.2261	-9.411816	-11.59923	-15.41621	-12.50898	-14.22795	0.4338873	9.225178	-0.62732	0.5606277	0.5606277	0.5606277	0.5606277	0.5606277	0.5606277	0.1089311	-0.560628	-0.560628	-0.560628	-0.560628	-0.560628	-10.95113	5.0773916	2.7057097	8.2490839	7.4922269	7.5753036	7.4826807	3.5706851	-4.420441	-2.575069	-5.503555	-5.425307	-4.844235

In R, one way to fit the (A, DS) model is with the following command:

berk.jnt = glm(Freq~Admit+Gender+Dept+Gender*Dept, family=poisson(), data=berk.data)

This model implies that association between D and S does NOT depend on level of the variable A. That is, the association between department and sex is independent of the rejection/acceptance decision.

\(\theta_{DS(A=reject)}=\theta_{DS(A=accept)}=\dfrac{\text{exp}(\lambda_{ij}^{DS})\text{exp}(\lambda_{i'j'}^{DS})}{\text{exp}(\lambda_{i'j}^{DS})\text{exp}(\lambda_{ij'}^{DS})}\)

The first estimated coefficient 1.9436 for the DS associations implies that the estimated odds ratio between sex and department (specifically, A versus F) is \(\exp(1.9436) = 6.98\) with 95% CI

\((\exp(1.695), \exp(2.192))= (5.45, 8.96)\)

> summary(berk.jnt)

Call:
glm(formula = Freq ~ Admit + Gender + Dept + Gender * Dept, family = poisson(), 
    data = berk.data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-12.744   -3.208   -0.058    2.495    9.869  

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)       4.88451    0.05728  85.269  < 2e-16 ***
AdmitRejected     0.45674    0.03051  14.972  < 2e-16 ***
GenderMale        0.08970    0.07492   1.197   0.2312    
DeptA            -1.14975    0.11042 -10.413  < 2e-16 ***
DeptB            -2.61301    0.20720 -12.611  < 2e-16 ***
DeptC             0.55331    0.06796   8.141 3.91e-16 ***
DeptD             0.09504    0.07483   1.270   0.2040    
DeptE             0.14193    0.07401   1.918   0.0551 .  
GenderMale:DeptA  1.94356    0.12683  15.325  < 2e-16 ***
GenderMale:DeptB  3.01937    0.21771  13.869  < 2e-16 ***
GenderMale:DeptC -0.69107    0.10187  -6.784 1.17e-11 ***
GenderMale:DeptD  0.01646    0.10334   0.159   0.8734    
GenderMale:DeptE -0.81123    0.11573  -7.010 2.39e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2650.10  on 23  degrees of freedom
Residual deviance:  877.06  on 11  degrees of freedom
AIC: 1062.1

But, we should really check the overall fit of the model first, to determine if these estimates are meaningful.

Model Fit

The goodness-of-fit statistic (in the output as "Residual deviance") 877.06 indicates that the model does not fit since the "Value/df" is much larger than 1.

How did we get these degrees of freedom? As usual, we're comparing this model to the saturated model, and the df is the difference in the numbers of parameters involved:

\(DF = (IJK-1) - [(I-1)+(J-1)+(K-1)+(J-1)(K-1)] = (I-1)(JK-1) \)

With \(I=2\), \(J=2\), and \(K=6\) corresponding to the levels of A, S, and D, respectively, this works out to be 11.

As before, we can look at the residuals for more information about why this model fits poorly. Recall that adjusted residuals have approximately a N(0, 1) distribution. In general, we have a lack of fit if (1) we have a large number of cells and adjusted residuals are greater than 3, or (2) we have a small number of cells and adjusted residuals are greater than 2. Here is only part of the output. Notice that the absolute value of the standardized residual for the first six cells are all large (e.g., in cell 1, the value is \(-15.18\)). Many other such residuals are rather large as well, indicating that this model fits poorly.

Evaluate the residuals

> fits = fitted(berk.jnt)
> resids = residuals(berk.jnt,type="pearson")
> h = lm.influence(berk.jnt)$hat
> adjresids = resids/sqrt(1-h)
> round(cbind(berk.data$Freq,fits,adjresids),2)
         fits adjresids
1  512 319.90     15.18
2  313 505.10    -15.18
3   89  41.88      9.42
4   19  66.12     -9.42
5  353 217.15     12.59
6  207 342.85    -12.59
...

Next, let us look at what is often the most interesting model of conditional independence.

10.2.4 - Conditional Independence

With three variables, there are three ways to satisfy conditional independence. The assumption is that two variables are independent, given the third. For example, if \(A\) and \(B\) are conditionally independent, given \(C\), which we denote by \((AC, BC)\), then the conditional distribution of \((AB)\), given \(C\), can be factored into the product of the two conditional marginals, given \(C\).

Main assumptions

(these are stated for the case \((AC,BC)\) but hold in a similar way for \((AB, BC)\) and \((AB, AC)\) as well):

The \(N = IJK\) counts in the cells are assumed to be independent observations of a Poisson random variable, and
there is no partial interaction between \(A\) and \(B\): \(\lambda_{ij}^{AB} =0\), for all \(i, j\), and
there is no three-way interaction: \(\lambda_{ijk}^{ABC}=0\) for all \(i, j, k\).

Note the constraints above are in addition to the usual set-to-zero or sum-to-zero constraints (present even in the saturated model) imposed to avoid overparameterization.

Model Structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C +\lambda_{ik}^{AC}+\lambda_{jk}^{BC}\)

In the example below, we consider the model where admission status and department are conditionally independent, given sex, which we denote by \((AS, DS)\).

In SAS, this model can be fitted as:

proc genmod data=berkeley order=data;
class D S A;
model count = D S A A*S D*S / dist=poisson link=log lrci type3 obstats;
run;

Model Fit:

The goodness-of-fit statistics indicate that the model does not fit.


Criteria For Assessing Goodness Of Fit
Criterion	DF	Value	Value/DF
Deviance	10	783.6070	78.3607
Scaled Deviance	10	783.6070	78.3607
Pearson Chi-Square	10	715.2958	71.5296
Scaled Pearson X2	10	715.2958	71.5296
Log Likelihood		20121.4021

How did we get these DF?

\(DF = (IJK-1) - [(I-1)+(J-1)+(K-1)+(I-1)(J-1)+(J-1)(K-1)]=J(I-1)(K-1) \)

With \(I=2\), \(J=2\), and \(K=6\) corresponding to the levels of A, S, and D, respectively, this works out to be 10.

So where is the lack of fit? As before, we look at residuals. For example, the adjusted residual for the first cell is \(-12.1747\), a great deviation from zero.

We can also evaluate overall the individual parameters and their significance:


LR Statistics For Type 3 Analysis
Source	DF	Chi-Square	Pr > ChiSq
D	5	313.04	<.0001
S	1	333.24	<.0001
A	1	283.32	<.0001
S*A	1	93.45	<.0001
D*S	5	1220.61	<.0001

This is like an ANOVA table in ANOVA and regression models. All parameters are significantly different from zero. That is they contribute significantly in describing the relationships between our variables, but the overall lack of fit of the model suggests that they are not sufficient.

In R, the (AS, DS) model can be fitted with

berk.cnd = glm(Freq~Admit+Gender+Dept+Admit*Gender+Dept*Gender, family=poisson(), data=berk.data)

Model Fit:

The goodness-of-fit statistics indicate that the model does not fit, e.g., Residual deviance: 783.6 on 10 degrees of freedom

How did we get these DF?

\(DF = (IJK-1) - [(I-1)+(J-1)+(K-1)+(I-1)(J-1)+(J-1)(K-1)]=J(I-1)(K-1) \)

With \(I=2\), \(J=2\), and \(K=6\) corresponding to the levels of A, S, and D, respectively, this works out to be 10.

So where is the lack of fit? As before, we look at residuals. For example, the adjusted residual for the first cell is \(-12.17\), a great deviation from zero.

> fits = fitted(berk.cnd)
> resids = residuals(berk.cnd,type="pearson")
> h = lm.influence(berk.cnd)$hat
> adjresids = resids/sqrt(1-h)
> round(cbind(berk.data$Freq,fits,adjresids),2)
         fits adjresids
1  512 367.28     12.17
2  313 457.72    -12.17
3   89  32.78     12.13
4   19  75.22    -12.13
5  353 249.31      9.91
6  207 310.69     -9.91
...

We can also evaluate overall the individual parameters and their significance. However, the ANOVA table R reports for glm-fitted models uses the sequential sum of squares, which depend on the order terms are added to the model.

> anova(berk.cnd, test="LR")
Analysis of Deviance Table

Model: poisson, link: log

Response: Freq

Terms added sequentially (first to last)


             Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
NULL                            23    2650.10              
Admit         1   230.03        22    2420.07 < 2.2e-16 ***
Gender        1   162.87        21    2257.19 < 2.2e-16 ***
Dept          5   159.52        16    2097.67 < 2.2e-16 ***
Admit:Gender  1    93.45        15    2004.22 < 2.2e-16 ***
Gender:Dept   5  1220.61        10     783.61 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

10.2.5 - Homogeneous Association

The homogeneous associations model is also known as a model of no-3-way interactions. Denoted by (AB, AC, BC), the only restriction this model imposes over the saturated model is that each pairwise conditional association doesn't depend on the value the third variable is fixed at. For example, the conditional odds ratio between \(A\) and \(B\), given \(C\) is fixed at its first level, must be the same as the conditional odds ratio between \(A\) and \(B\), given \(C\) is fixed at its second level, and so on.

Main assumptions

The N = IJK counts in the cells are assumed to be independent observations of a Poisson random variable, and
there is no three-way interaction among the variables: \(\lambda_{ijk}^{ABC}=0\) for all \(i, j, k\).

Model Structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C+\lambda_{ij}^{AB}+\lambda_{jk}^{BC}+\lambda_{ik}^{AC}\)

In terms of the Berkeley example, this model implies that the conditional association between department and sex does not depend on the fixed value of admission status, that the conditional association between sex and admission status does not depend on the fixed value of department, and the conditional association between department and admission status does not depend on the fixed value of sex.

Does this model fit? Even this model doesn't fit well, but it seems to fit better than the previous models. The deviance statistic is \(G^2= 20.2251\) with df= 5, and the Value/df is 4.0450.

Stop and Think!

Can you figure out DF?

Since the only terms that separate this model from the saturated one are those for the three-way interactions, the degrees of freedom must be \((I-1)(J-1)(K-1)\), which is \((2-1)(2-1)(6-1)=5\) in this example.

In SAS, this model can be fitted as:

proc genmod data=berkeley order=data;
class D S A;
model count = D S A D*S D*A S*A / dist=poisson link=log lrci type3 obstats;
run;

Are all terms in the model significant (e.g. look at "Type 3 Analysis output"); recall we need to use option "type3". For example, here is the ANOVA-like table that shows that SA association does not seem to be significant,


LR Statistics For Type 3 Analysis
Source	DF	Chi-Square	Pr > ChiSq
D	5	360.23	<.0001
S	1	252.58	<.0001
A	1	303.43	<.0001
D*S	5	1128.70	<.0001
D*A	5	763.40	<.0001
S*A	1	1.53	0.2159

Here is part of the output from the "Analysis of Parameter Estimates" given the values for all the parameters,


Analysis Of Maximum Likelihood Parameter Estimates
Parameter			DF	Estimate	Standard Error	Likelihood Ratio 95% Confidence Limits		Wald Chi-Square	Pr > ChiSq
Intercept			1	3.1374	0.1567	2.8166	3.4322	400.74	<.0001
D	DeptA		1	1.1356	0.1820	0.7858	1.5007	38.94	<.0001
D	DeptB		1	-0.3425	0.2533	-0.8525	0.1450	1.83	0.1763
D	DeptC		1	2.2228	0.1649	1.9102	2.5579	181.77	<.0001
D	DeptD		1	1.7439	0.1682	1.4241	2.0848	107.52	<.0001
D	DeptE		1	1.4809	0.1762	1.1439	1.8361	70.64	<.0001
D	DeptF		0	0.0000	0.0000	0.0000	0.0000	.	.
S	Male		1	-0.0037	0.1065	-0.2126	0.2048	0.00	0.9720
S	Female		0	0.0000	0.0000	0.0000	0.0000	.	.
A	Reject		1	2.6246	0.1577	2.3270	2.9467	276.88	<.0001
A	Accept		0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptA	Male	1	2.0023	0.1357	1.7394	2.2717	217.68	<.0001
D*S	DeptA	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptB	Male	1	3.0771	0.2229	2.6595	3.5364	190.63	<.0001
D*S	DeptB	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptC	Male	1	-0.6628	0.1044	-0.8679	-0.4587	40.34	<.0001
D*S	DeptC	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptD	Male	1	0.0440	0.1057	-0.1633	0.2513	0.17	0.6774
D*S	DeptD	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptE	Male	1	-0.7929	0.1167	-1.0226	-0.5652	46.19	<.0001
D*S	DeptE	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptF	Male	0	0.0000	0.0000	0.0000	0.0000	.	.
D*S	DeptF	Female	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptA	Reject	1	-3.3065	0.1700	-3.6510	-2.9834	378.38	<.0001
D*A	DeptA	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptB	Reject	1	-3.2631	0.1788	-3.6240	-2.9220	333.12	<.0001
D*A	DeptB	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptC	Reject	1	-2.0439	0.1679	-2.3842	-1.7247	148.24	<.0001
D*A	DeptC	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptD	Reject	1	-2.0119	0.1699	-2.3559	-1.6884	140.18	<.0001
D*A	DeptD	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptE	Reject	1	-1.5672	0.1804	-1.9300	-1.2213	75.44	<.0001
D*A	DeptE	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptF	Reject	0	0.0000	0.0000	0.0000	0.0000	.	.
D*A	DeptF	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
S*A	Male	Reject	1	0.0999	0.0808	-0.0582	0.2588	1.53	0.2167
S*A	Male	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
S*A	Female	Reject	0	0.0000	0.0000	0.0000	0.0000	.	.
S*A	Female	Accept	0	0.0000	0.0000	0.0000	0.0000	.	.
Scale			0	1.0000	0.0000	1.0000	1.0000

Note:The scale parameter was held fixed.

Recall, we are interested in the highest-order terms, thus two-way associations here, and they correspond to log-odds ratios. For example, the first coefficient 0.0999, reported in the row beginning with "S*A, male, reject", is the conditional log-odds ratio between sex and admission status. The interpretation is as follows: for a fixed department, the odds a male is rejected is \(\exp(0.0999)=1.10506\) times the odds that a female is rejected.

Although the department is fixed for this interpretation (so the comparison is among individuals applying to the same department), it doesn't matter which department we're focusing on; they all lead to the same result under this model. However, this model does not fit well, so we can't really rely on the inferences based on this model.

In R, here is one way of fitting this model (note that this syntax will also include all first-order terms automatically):

berk.ha = glm(Freq~(Admit+Gender+Dept)^2, family=poisson(), data=berk.data)

Here is part of the summary output that gives values for all the parameter estimates:

> summary(berk.ha)
...
Coefficients:
                          Estimate Std. Error z value Pr(>|z|)    
(Intercept)               3.137358   0.156723  20.019  < 2e-16 ***
AdmitRejected             2.624559   0.157728  16.640  < 2e-16 ***
GenderMale               -0.003731   0.106460  -0.035    0.972    
DeptA                     1.135552   0.181963   6.241 4.36e-10 ***
DeptB                    -0.342489   0.253251  -1.352    0.176    
DeptC                     2.222782   0.164869  13.482  < 2e-16 ***
DeptD                     1.743872   0.168181  10.369  < 2e-16 ***
DeptE                     1.480918   0.176194   8.405  < 2e-16 ***
AdmitRejected:GenderMale  0.099870   0.080846   1.235    0.217    
AdmitRejected:DeptA      -3.306480   0.169982 -19.452  < 2e-16 ***
AdmitRejected:DeptB      -3.263082   0.178784 -18.252  < 2e-16 ***
AdmitRejected:DeptC      -2.043882   0.167868 -12.176  < 2e-16 ***
AdmitRejected:DeptD      -2.011874   0.169925 -11.840  < 2e-16 ***
AdmitRejected:DeptE      -1.567174   0.180436  -8.685  < 2e-16 ***
GenderMale:DeptA          2.002319   0.135713  14.754  < 2e-16 ***
GenderMale:DeptB          3.077140   0.222869  13.807  < 2e-16 ***
GenderMale:DeptC         -0.662814   0.104357  -6.351 2.13e-10 ***
GenderMale:DeptD          0.043995   0.105736   0.416    0.677    
GenderMale:DeptE         -0.792867   0.116664  -6.796 1.07e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 2650.095  on 23  degrees of freedom
Residual deviance:   20.204  on  5  degrees of freedom
AIC: 217.26

Recall, we are interested in the highest-order terms, thus two-way associations here, and they correspond to log-odds ratios. For example, the first coefficient 0.0999, reported in the row beginning with "AdmitRejected:GenderMale", is the conditional log-odds ratio between sex and admission status. The interpretation is as follows: for a fixed department, the odds a male is rejected is \(\exp(0.0999)=1.10506\) times the odds that a female is rejected.

Although the department is fixed for this interpretation (so the comparison is among individuals applying to the same department), it doesn't matter which department we're focusing on; they all lead to the same result under this model. However, this model does not fit well (deviance statistic of 20.204 with 5 df), so we can't really rely on the inferences based on this model.

10.2.6 - Model Selection

According to our usual approach for hierarchical models, where one is a special case of the other, we can use a likelihood ratio test to measure the reduction in fit of the smaller model (null hypothesis), relative to the larger model (alternative hypothesis). The degrees of freedom for these tests are the difference between the numbers of parameters involved between the two models.

Below is a summary of all possible models for the Berkeley admissions example data, with complete independence (most restrictive) at the top the saturated model at the bottom. The df, \(G^2\), and p-value columns correspond to the deviance (goodness-of-fit) test for each model compared with the saturated model.

Model	df	\(G^2\)	p-value
(D, S, A)	16	2097.671	< .001
(DS, A)	11	877.056	< .001
(D, SA)	15	2004.222	< .001
(DA, S)	11	1242.350	< .001
(DS, SA)	10	783.607	< .001
(DS, DA)	6	21.736	< .001
(DA, SA)	10	1148.901	< .001
(DS, DA, SA)	5	20.204	< .001
(DSA)	0	0.00

Based on these results, the saturated model would be preferred because any reduced model has a significantly worse fit (all p-values are significant). However, if a reduced model would have been acceptable, relative to the saturated one, we could continue to test further reductions with likelihood ratio tests.

Likelihood Ratio Tests

Suppose the model of homogeneous association (HA) had been insignificantly different from the saturated model. We would then have preferred the HA model because it has fewer parameters and is easier to work with overall. And to test additional reductions, we could use the likelihood ratio test with the HA model as the alternative hypothesis (instead of the saturated one).

For example, to test the (DS, DA) model, which assumes sex and admission status are conditionally independent, given department, the hypotheses would be \(H_0\): (DS, DA) versus \(H_a\): (DS, DA, SA). The test statistic would be twice the difference between their log-likelihood values but could be computed directly from the deviance statistics above:

\(G^2=783.607-20.204 =763.4\)

Relative to a chi-square distribution with \(10-5=5\) degrees of freedom, this is would highly significant (p-value less than .001). If, however, this conditional independence model hadn't been rejected, we could place into the role of the alternative hypothesis to test the further reduced joint independence model (DS, A), and so on. As we've seen with our earlier models, this approach works for any full versus reduced model comparison.

^[1]	Link
↥	Has Tooltip/Popover
	Toggleable Visibility

Dept.	Males admitted	Males rejected	Females admitted	Females rejected
A	512	313	89	19
B	353	207	17	8
C	120	205	202	391
D	139	279	131	244
E	53	138	94	299
F	22	351	24	317

Dept.	Males admitted	Males rejected	Females admitted	Females rejected
A	512	313	89	19
B	353	207	17	8
C	120	205	202	391
D	139	279	131	244
E	53	138	94	299
F	22	351	24	317

Dept.	Males admitted	Males rejected	Females admitted	Females rejected
A	512	313	89	19
B	353	207	17	8
C	120	205	202	391
D	139	279	131	244
E	53	138	94	299
F	22	351	24	317