# 10.2.4 - Conditional Independence

With three variables, there are three ways to satisfy conditional independence. The assumption is that two of the variables are independent, given the third. For example, if \(A\) and \(B\) are conditionally independent, given \(C\), which we denote by \((AC, BC)\), then the conditional distribution of \((AB)\), given \(C\), can be factored into the product of the two conditional marginals, given \(C\).
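In symbols, this factorization says that, at each level \(k\) of \(C\),

\(\pi_{ij \mid k} = \pi_{i+ \mid k}\,\pi_{+j \mid k} \quad \text{for all } i, j\)

or equivalently, in terms of expected counts, \(\mu_{ijk} = \mu_{i+k}\,\mu_{+jk}/\mu_{++k}\).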

#### Main assumptions

These assumptions are stated for the case \((AC, BC)\), but analogous statements hold for \((AB, BC)\) and \((AB, AC)\) as well:

- The \(N = IJK\) cell counts are independent observations of Poisson random variables, and
- there is no partial interaction between \(A\) and \(B\): \(\lambda_{ij}^{AB} =0\), for all \(i, j\), and
- there is no three-way interaction: \(\lambda_{ijk}^{ABC}=0\) for all \(i, j, k\).

Note that the constraints above are in addition to the usual set-to-zero or sum-to-zero constraints (present even in the saturated model) imposed to avoid overparameterization.

#### Model Structure

\(\log(\mu_{ijk})=\lambda+\lambda_i^A+\lambda_j^B+\lambda_k^C +\lambda_{ik}^{AC}+\lambda_{jk}^{BC}\)
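Because the \(\lambda^{AB}\) and \(\lambda^{ABC}\) terms are absent, this model forces every conditional \(AB\) odds ratio to equal 1 at each level of \(C\): for any \(i, i', j, j'\) and any \(k\),

\(\log\dfrac{\mu_{ijk}\,\mu_{i'j'k}}{\mu_{ij'k}\,\mu_{i'jk}} = 0\)

since every term remaining in the model cancels in this ratio.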

In the example below, we consider the model where admission status and department are conditionally independent, given sex, which we denote by \((AS, DS)\).

In SAS, this model can be fitted as:

```
proc genmod data=berkeley order=data;
class D S A;
model count = D S A A*S D*S / dist=poisson link=log lrci type3 obstats;
run;
```

**Model Fit**:

The goodness-of-fit statistics indicate that the model does not fit.

**Criteria For Assessing Goodness Of Fit**

Criterion | DF | Value | Value/DF
---|---|---|---
Deviance | 10 | 783.6070 | 78.3607
Scaled Deviance | 10 | 783.6070 | 78.3607
Pearson Chi-Square | 10 | 715.2958 | 71.5296
Scaled Pearson X2 | 10 | 715.2958 | 71.5296
Log Likelihood | | 20121.4021 |

How did we get these DF?

\(DF = (IJK-1) - [(I-1)+(J-1)+(K-1)+(I-1)(J-1)+(J-1)(K-1)]=J(I-1)(K-1) \)

With \(I=2\), \(J=2\), and \(K=6\) corresponding to the levels of A, S, and D, respectively, this works out to be 10.
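As a quick check, this degrees-of-freedom count can be reproduced in a few lines (a minimal sketch; the variable names are illustrative):

```python
# DF for the (AS, DS) model: number of cells minus number of free parameters,
# with I, J, K the levels of A (admission), S (sex), and D (department).
I, J, K = 2, 2, 6
n_cells = I * J * K
n_params = (1 + (I - 1) + (J - 1) + (K - 1)
            + (I - 1) * (J - 1)      # A*S interaction
            + (J - 1) * (K - 1))     # D*S interaction
df = n_cells - n_params
print(df)  # 10; J*(I-1)*(K-1) gives the same value
```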

So where is the lack of fit? As before, we look at residuals. For example, the adjusted residual for the first cell is \(-12.1747\), a substantial deviation from zero.

We can also evaluate the individual parameters and their overall significance:

**LR Statistics For Type 3 Analysis**

Source | DF | Chi-Square | Pr > ChiSq
---|---|---|---
D | 5 | 313.04 | <.0001
S | 1 | 333.24 | <.0001
A | 1 | 283.32 | <.0001
S*A | 1 | 93.45 | <.0001
D*S | 5 | 1220.61 | <.0001

This is analogous to the ANOVA table in regression models. All parameters are significantly different from zero; that is, they contribute significantly to describing the relationships among our variables, but the overall lack of fit of the model suggests that they are not sufficient.

In R, the (AS, DS) model can be fitted with

`berk.cnd = glm(Freq~Admit+Gender+Dept+Admit*Gender+Dept*Gender, family=poisson(), data=berk.data)`

**Model Fit**:

The goodness-of-fit statistics indicate that the model does not fit; e.g., the residual deviance is 783.6 on 10 degrees of freedom.

How did we get these DF?

\(DF = (IJK-1) - [(I-1)+(J-1)+(K-1)+(I-1)(J-1)+(J-1)(K-1)]=J(I-1)(K-1) \)

With \(I=2\), \(J=2\), and \(K=6\) corresponding to the levels of A, S, and D, respectively, this works out to be 10.

So where is the lack of fit? As before, we look at residuals. For example, the adjusted residual for the first cell is \(12.17\) (and \(-12.17\) for the second), a substantial deviation from zero.

```
> fits = fitted(berk.cnd)
> resids = residuals(berk.cnd, type="pearson")   # Pearson residuals
> h = lm.influence(berk.cnd)$hat                 # leverages
> adjresids = resids/sqrt(1-h)                   # adjusted residuals
> round(cbind(berk.data$Freq, fits, adjresids), 2)
          fits adjresids
1 512   367.28     12.17
2 313   457.72    -12.17
3  89    32.78     12.13
4  19    75.22    -12.13
5 353   249.31      9.91
6 207   310.69     -9.91
...
```
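For this conditional-independence model, the ML fitted values have a closed form: \(\hat{\mu} = n_{a+s}\,n_{+ds}/n_{++s}\), the product of the \(A \times S\) and \(D \times S\) margins divided by the \(S\) margin. A minimal Python sketch (the counts below are the standard Berkeley admissions data, assumed to match `berk.data`) reproduces the fitted values and the fit statistics reported above:

```python
# Closed-form MLEs for the (AS, DS) model: mu_hat = n_{a+s} * n_{+ds} / n_{++s},
# computed within each level of S (sex). Counts assumed to match berk.data.
from math import log

# counts[dept][sex] = (admitted, rejected); sex 0 = male, 1 = female
counts = {
    "A": [(512, 313), (89, 19)],
    "B": [(353, 207), (17, 8)],
    "C": [(120, 205), (202, 391)],
    "D": [(138, 279), (131, 244)],
    "E": [(53, 138), (94, 299)],
    "F": [(22, 351), (24, 317)],
}

# margins within each sex
n_s = [0, 0]                         # n_{++s}: total per sex
n_as = [[0, 0], [0, 0]]              # n_{a+s}: admit status a by sex s
n_ds = {d: [0, 0] for d in counts}   # n_{+ds}: department d by sex s
for d, by_sex in counts.items():
    for s, (adm, rej) in enumerate(by_sex):
        n_s[s] += adm + rej
        n_as[0][s] += adm
        n_as[1][s] += rej
        n_ds[d][s] += adm + rej

G2 = X2 = 0.0
for d, by_sex in counts.items():
    for s, cell in enumerate(by_sex):
        for a, n in enumerate(cell):
            mu = n_as[a][s] * n_ds[d][s] / n_s[s]
            G2 += 2 * n * log(n / mu)        # deviance contribution
            X2 += (n - mu) ** 2 / mu         # Pearson contribution

# should agree with the deviance (783.61) and Pearson X2 (715.30) above
print(round(G2, 4), round(X2, 4))
```

The first fitted value, \(1198 \times 825 / 2691 \approx 367.28\), matches the `fits` column in the R output above.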

We can also evaluate the individual parameters and their overall significance. However, `anova()` on a glm fit reports a sequential (Type I) analysis of deviance, so the test for each term depends on the order in which terms are added to the model. (For order-independent, Type-III-style likelihood-ratio tests, `drop1(berk.cnd, test="LRT")` can be used.)

```
> anova(berk.cnd, test="LR")
Analysis of Deviance Table
Model: poisson, link: log
Response: Freq
Terms added sequentially (first to last)
Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL 23 2650.10
Admit 1 230.03 22 2420.07 < 2.2e-16 ***
Gender 1 162.87 21 2257.19 < 2.2e-16 ***
Dept 5 159.52 16 2097.67 < 2.2e-16 ***
Admit:Gender 1 93.45 15 2004.22 < 2.2e-16 ***
Gender:Dept 5 1220.61 10 783.61 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

This is analogous to the ANOVA table in regression models. All parameters are significantly different from zero; that is, they contribute significantly to describing the relationships among our variables, but the overall lack of fit of the model suggests that they are not sufficient.