10.2.6 - Model Selection

Following our usual approach for hierarchical models, where one model is a special case of the other, we can use a likelihood ratio test to measure the reduction in fit of the smaller model (null hypothesis) relative to the larger model (alternative hypothesis). The degrees of freedom for each test equal the difference in the number of parameters between the two models.
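
In symbols, the test can be sketched as follows. The notation is introduced here for illustration only: \(\ell_0\), \(\ell_1\), and \(\ell_s\) denote the maximized log-likelihoods of the smaller model \(M_0\), the larger model \(M_1\), and the saturated model, and \(G^2(M)\) and \(df_M\) denote a model's deviance and degrees of freedom measured against the saturated model.

\[
G^2(M_0 \mid M_1) = 2(\ell_1 - \ell_0) = 2(\ell_s - \ell_0) - 2(\ell_s - \ell_1) = G^2(M_0) - G^2(M_1), \qquad df = df_{M_0} - df_{M_1}
\]

So the likelihood ratio statistic for two nested models is the difference of their deviances, and its degrees of freedom are the difference of their deviance degrees of freedom.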

Below is a summary of all possible models for the Berkeley admissions example data, with complete independence (most restrictive) at the top and the saturated model at the bottom. The df, \(G^2\), and p-value columns correspond to the deviance (goodness-of-fit) test for each model compared with the saturated model.

Model	df	\(G^2\)	p-value
(D, S, A)	16	2097.671	< .001
(DS, A)	11	877.056	< .001
(D, SA)	15	2004.222	< .001
(DA, S)	11	1242.350	< .001
(DS, SA)	10	783.607	< .001
(DS, DA)	6	21.736	0.001
(DA, SA)	10	1148.901	< .001
(DS, DA, SA)	5	20.204	0.001
(DSA)	0	0.00

Based on these results, the saturated model would be preferred because every reduced model fits significantly worse (all p-values are significant). However, if a reduced model had been acceptable relative to the saturated one, we could continue to test further reductions with likelihood ratio tests.
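
The p-value column can be recomputed from the df and \(G^2\) columns as upper-tail chi-square probabilities. The sketch below is one way to do this with only the Python standard library; the helper `chi2_sf` and the `MODELS` dictionary are names made up for this illustration and are not part of the course software. In practice, `scipy.stats.chi2.sf` would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1.

    Illustrative helper: uses the recurrence
    Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), with y = x / 2.
    """
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    while a < df / 2.0:
        q += math.exp(-y) * y ** a / math.gamma(a + 1.0)
        a += 1.0
    return q

# (df, G^2) for each reduced model, copied from the table above
MODELS = {
    "(D, S, A)": (16, 2097.671),
    "(DS, A)": (11, 877.056),
    "(D, SA)": (15, 2004.222),
    "(DA, S)": (11, 1242.350),
    "(DS, SA)": (10, 783.607),
    "(DS, DA)": (6, 21.736),
    "(DA, SA)": (10, 1148.901),
    "(DS, DA, SA)": (5, 20.204),
}

# Deviance test of each reduced model against the saturated model
p_values = {name: chi2_sf(g2, df) for name, (df, g2) in MODELS.items()}

for name, (df, g2) in MODELS.items():
    p = p_values[name]
    shown = "< .001" if p < 0.001 else f"{p:.4f}"
    print(f"{name:14s} df={df:2d}  G2={g2:9.3f}  p={shown}")
```

The saturated model (DSA) is left out because it has zero degrees of freedom and fits the data exactly, so there is nothing to test.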

Likelihood Ratio Tests

Suppose the model of homogeneous association (HA) had not been significantly different from the saturated model. We would then have preferred the HA model because it has fewer parameters and is easier to work with overall. To test additional reductions, we could use the likelihood ratio test with the HA model as the alternative hypothesis (instead of the saturated one).

For example, to test the (DS, DA) model, which assumes sex and admission status are conditionally independent given department, the hypotheses would be \(H_0\): (DS, DA) versus \(H_a\): (DS, DA, SA). The test statistic is twice the difference between their log-likelihood values, but it can be computed directly from the deviance statistics above:

\(G^2=21.736-20.204=1.532\)

Relative to a chi-square distribution with \(6-5=1\) degree of freedom, this is not significant (p-value approximately 0.22), so the conditional independence model would not be rejected in favor of the HA model. We could then place it into the role of the alternative hypothesis to test the further reduced joint independence model (DS, A), and so on. As we've seen with our earlier models, this approach works for any full versus reduced model comparison.
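
Any full versus reduced comparison of this kind can be scripted from the deviance table alone. The following is a minimal sketch, assuming the reduced model is nested in the full one; `lr_test`, `chi2_sf`, and `DEVIANCE` are hypothetical names chosen for this example, and `chi2_sf` is a standard-library stand-in for `scipy.stats.chi2.sf`.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df >= 1 (illustrative helper)."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += math.exp(-y) * y ** a / math.gamma(a + 1.0)
        a += 1.0
    return q

# (df, G^2) of each model against the saturated model, from the table above
DEVIANCE = {
    "(D, S, A)": (16, 2097.671),
    "(DS, A)": (11, 877.056),
    "(D, SA)": (15, 2004.222),
    "(DA, S)": (11, 1242.350),
    "(DS, SA)": (10, 783.607),
    "(DS, DA)": (6, 21.736),
    "(DA, SA)": (10, 1148.901),
    "(DS, DA, SA)": (5, 20.204),
    "(DSA)": (0, 0.0),
}

def lr_test(reduced, full):
    """Likelihood ratio test of H0: reduced versus Ha: full.

    Assumes `reduced` is nested in `full`; this function does not check it.
    Returns (G^2 difference, df difference, p-value).
    """
    df0, g0 = DEVIANCE[reduced]
    df1, g1 = DEVIANCE[full]
    g2, df = g0 - g1, df0 - df1
    return g2, df, chi2_sf(g2, df)

# Each pair is (reduced model, full model)
comparisons = [
    ("(DS, DA)", "(DS, DA, SA)"),
    ("(DS, SA)", "(DS, DA, SA)"),
    ("(DS, A)", "(DS, DA)"),
]
for reduced, full in comparisons:
    g2, df, p = lr_test(reduced, full)
    print(f"H0: {reduced} vs Ha: {full}: G2 = {g2:.3f}, df = {df}, p = {p:.4g}")
```

Using the saturated model (DSA) as the full model reproduces the deviance tests in the table, since its own deviance and degrees of freedom are both zero.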