
Example - Job Satisfaction

Blue-collar workers' job satisfaction data from a large-scale investigation in Denmark in 1968 (Andersen, 1985). The three variables are (1) worker's (W) job satisfaction (low, high), (2) supervisor's (S) satisfaction (low, high), and (3) quality of management (M) (bad, good).

References: See Categorical Data Analysis Using SAS (2000), Sec. 16.3.5 and collar.sas and/or collar.R (you will need collart.txt for the R code).

Here is a table of observed and fitted values for selected models:

Manage   Super   Worker   n_ijk   (M, S, W)   (MS, W)   (MS, MW)   (MS, MW, SW)
bad      low     low        103       50.15     71.78      97.16         102.26
bad      low     high        87       82.59    118.22      92.84          87.74
bad      high    low         32       49.59     27.96      37.84          32.74
bad      high    high        42       81.67     46.04      36.16          41.26
good     low     low         59       85.10     63.47      51.03          59.74
good     low     high       109      140.15    104.53     116.97         108.26
good     high    low         78       84.15    105.79      85.97          77.26
good     high    high       205      138.59    174.21     197.03         205.74

Notice that as you add more parameters to the model, the fitted values get closer to the observed values; this is to be expected, since the saturated model gives a perfect fit. But remember, our goal is to find the simplest model that adequately explains the data; i.e., the parsimonious model.

Let's see how we can assess the fit of each model and compare it to other models in order to render our final decision.

Chi-squared goodness-of-fit tests

For testing the overall fit of the model:

H0 : the fitted model (M0) fits well
HA : the saturated model (MA) fits well

L0 = log-likelihood for the fitted model
LA = log-likelihood for the saturated model

\(G^2 = -2(L_0 - L_A)\)
\(df = df_0 - df_A\) = (# cells) − (# unique parameters in M0)
= (# unique parameters in MA) − (# unique parameters in M0)

(For the saturated model, the number of unique parameters equals the number of cells, so dfA = 0.)
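As an illustrative check (this sketch is not part of collar.sas or collar.R), we can reproduce G² and X² for the conditional independence model (MS, MW) directly from the observed counts, since that model's fitted values have the closed form \(\hat{\mu}_{ijk} = n_{ij+} n_{i+k} / n_{i++}\):

```python
import math

# Observed counts n_ijk indexed by (Management, Supervisor, Worker)
n = {("bad",  "low",  "low"): 103, ("bad",  "low",  "high"):  87,
     ("bad",  "high", "low"):  32, ("bad",  "high", "high"):  42,
     ("good", "low",  "low"):  59, ("good", "low",  "high"): 109,
     ("good", "high", "low"):  78, ("good", "high", "high"): 205}

# Fitted values for (MS, MW): mu_ijk = n_{ij+} * n_{i+k} / n_{i++}
mu = {}
for (i, j, k) in n:
    n_ij = sum(v for (a, b, c), v in n.items() if (a, b) == (i, j))  # n_{ij+}
    n_ik = sum(v for (a, b, c), v in n.items() if (a, c) == (i, k))  # n_{i+k}
    n_i  = sum(v for (a, b, c), v in n.items() if a == i)            # n_{i++}
    mu[(i, j, k)] = n_ij * n_ik / n_i

G2 = 2 * sum(n[c] * math.log(n[c] / mu[c]) for c in n)  # deviance
X2 = sum((n[c] - mu[c]) ** 2 / mu[c] for c in n)        # Pearson chi-square
p  = math.exp(-G2 / 2)  # chi-square upper-tail probability for df = 2
print(round(G2, 2), round(X2, 2), round(p, 2))  # 5.39 5.41 0.07
```

The printed values agree with those reported for (MS, MW); note that for df = 2 the chi-squared upper-tail probability simplifies to exp(−G²/2).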

Model          df       G²    p-value        X²    p-value
(M, S, W)       4   118.00    < 0.001    128.09    < 0.001
(MS, W)         3    35.60    < 0.001     35.72    < 0.001
(MW, S)         3    87.79    < 0.001     85.02    < 0.001
(M, WS)         3   102.11    < 0.001     99.09    < 0.001
(MW, SW)        2    71.90    < 0.001     70.88    < 0.001
(MS, MW)        2     5.39    0.07         5.41    0.07
(MS, WS)        2    19.71    < 0.001     19.88    < 0.001
(MW, SW, MS)    1     0.065   0.80         0.069   0.80

These are global tests, and all the assumptions and properties of the chi-squared and deviance statistics that we have learned thus far hold. Based on these values, there are two reasonably well-fitting models: conditional independence of S and W given M, (MS, MW), and the homogeneous associations model, (MS, MW, SW).

Residuals

The residuals can be used to assess how well the individual cells are fitted by the model. A good model should have small residuals.

As before, we can consider Pearson residuals, adjusted residuals, and deviance residuals; we defined these in Lesson 2 -- recall that you can always use the course's "Search Engine" to find this information. If the model holds, the adjusted residuals, for example, should be close to zero.

There is evidence of lack of fit if:

  1. there are only a few cells (small N) and some adjusted residuals are greater than 2, or
  2. the sample is large (large N) and some adjusted residuals are greater than 3.

Here is a table with the observed values, the fitted values for the two models of interest, and their adjusted residuals.

                                     (MS, MW)              (MS, MW, SW)
Manage   Super   Worker   n_ijk   \(\hat{\mu}_{ijk}\)   adj res   \(\hat{\mu}_{ijk}\)   adj res
bad      low     low        103     97.16    1.60   102.26    0.25
bad      low     high        87     92.84   -1.60    87.74   -0.25
bad      high    low         32     37.84   -1.60    32.74   -0.25
bad      high    high        42     36.16    1.60    41.26    0.25
good     low     low         59     51.03    1.69    59.74   -0.25
good     low     high       109    116.97   -1.69   108.26    0.25
good     high    low         78     85.97   -1.69    77.26    0.25
good     high    high       205    197.03    1.69   205.74   -0.25

The reported residuals are quite small, and both models look good. Is there additional information that could make us more confident about our decision to accept or reject a particular model?

Recall that in SAS, the OBSTATS option in the MODEL statement produces various residuals, e.g., StReschi:

                             Observation Statistics

Observation 1:  count = 103, manager = bad, super = low, worker = low
  Pred     = 97.159091    Xbeta    = 4.5763497    Std      = 0.094248
  HessWgt  = 97.159091    Lower    = 80.771732    Upper    = 116.8712
  Resraw   = 5.8409091    Reschi   = 0.5925687    Resdev   = 0.5867754
  StResdev = 1.585496     StReschi = 1.6011497    Reslik   = 1.5990147
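For instance, the Reschi value in this output is just the Pearson residual for the first cell, which we can verify by hand (an illustrative Python check, not part of the course code):

```python
import math

# Pearson residual for observation 1: (observed - fitted) / sqrt(fitted)
obs, fit = 103, 97.159091
reschi = (obs - fit) / math.sqrt(fit)
print(round(reschi, 4))  # 0.5926
```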

Recall that in R you can use the function residuals(); more specifically, rstandard() gives standardized deviance residuals and rstudent() gives likelihood residuals, e.g., rstandard(modelCI) from collar.R:

> rstandard(modelCI)
        1         2         3         4         5         6         7         8
 1.585496 -1.618395 -1.645241  1.560709  1.645956 -1.706941 -1.714344  1.676040

We can, for example, compare these two models to each other via the so-called partial association test.

Partial Association (PA) Tests

H0 :   there is no partial association between two variables given the third, or

         the conditional (partial) odds ratios equal 1, or

         the two-way interaction term equals zero, e.g., \(\lambda_{jk}^{SW}=0\)

HA :   there is a partial association between two variables given the third, or

         the conditional (partial) odds ratios are NOT equal to 1, or

         the two-way interaction term is NOT equal to zero, e.g., \(\lambda_{jk}^{SW} \neq 0\)

To test this, we compare a model that does NOT contain the partial association of interest to a more complex model that includes this effect, using the likelihood-ratio statistic, i.e., the difference of the deviances of the two models:

For example, let

M0 = the restricted model (MS, MW)
M1 = the more complex model (MS, MW, SW)

\(\Delta G^2=G^2_0-G^2_1\)
\(\Delta df = df_0 - df_1\)

Notice the similarity of the partial association test to the previously discussed goodness-of-fit test for assessing the "global" fit of the model. Here we make a relative comparison of two models, whereas in the goodness-of-fit test we always compare the fitted model to the saturated model; that is, the role of M1 here is played by MA in the goodness-of-fit test.
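Using the deviances from the goodness-of-fit table earlier, the partial association test for the SW term takes only a few lines; this is an illustrative Python sketch (not part of the course files), using the identity P(X > x) = erfc(√(x/2)) for the chi-squared upper tail with df = 1:

```python
import math

# Partial association test of lambda^SW = 0: compare the restricted model
# M0 = (MS, MW) to the more complex M1 = (MS, MW, SW), using the
# deviances (G^2) reported earlier.
G2_0, df_0 = 5.39, 2     # (MS, MW)
G2_1, df_1 = 0.065, 1    # (MS, MW, SW)

dG2 = G2_0 - G2_1        # Delta G^2
ddf = df_0 - df_1        # Delta df
# Chi-square upper-tail probability for df = 1 via the complementary
# error function: P(X > x) = erfc(sqrt(x/2))
p = math.erfc(math.sqrt(dG2 / 2))
print(round(dG2, 3), ddf, round(p, 2))
```

ΔG² = 5.325 on 1 df gives p ≈ 0.02, so the SW term is statistically significant at the 0.05 level.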

Here is a tabular summary that compares each of conditional independence models to the homogeneous association model.

Model          df   G²      p         H0                          Δdf   ΔG²      p-value
(MW, SW, MS)    1   0.065   0.80      -                            -     -        -
(MW, SW)        2   71.90   < 0.001   \(\lambda_{ij}^{MS}=0\)      1    71.835   < 0.001
(MS, MW)        2   5.39    0.07      \(\lambda_{jk}^{SW}=0\)      1     5.325   0.02
(MS, WS)        2   19.71   < 0.001   \(\lambda_{ik}^{MW}=0\)      1    19.645   < 0.001

From the above table, we can conclude that each two-way term, when present, contributes significantly to a better fit; thus the homogeneous association model fits better than any of the conditional independence models.

Here is an animated graphical representation summarizing the fits and partial tests for all models in this example:

So far it seems that the homogeneous association model describes our data best. We typically carry out inference on the chosen model in terms of odds ratios.

Confidence Intervals for Odds Ratios

Recall that the (conditional) odds ratios are direct functions of the log-linear parameters.

Consider estimating the partial odds ratio for SW in our example from the homogeneous association model.

Based on SAS/GENMOD (see collar.sas and collarDIndex.sas):

\(\text{log}(\hat{\theta}_{SW(i)})=0.3872\text{ and }\hat{\theta}_{SW(i)}=\text{exp}(0.3872)=1.47\)

Recall that this is a Wald statistic, thus a

(1 − α) × 100% CI for \(\text{log}(\hat{\theta}_{SW(i)})\)

\(\text{log}(\hat{\theta}_{SW(i)}) \pm z_{\alpha/2}(ASE)\)

E.g. 95% CI for \(\text{log}(\hat{\theta}_{SW(i)})\):

0.3872 ± 1.96(0.167) = (0.056, 0.709)

and 95% CI for \(\hat{\theta}_{SW(i)}\):

(exp(0.056), exp(0.709)) = (1.06, 2.03)
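This interval is simple enough to verify by hand; the following illustrative Python sketch uses the rounded estimate 0.3872 and ASE 0.167 quoted above, so its endpoints differ from the quoted interval in the third decimal:

```python
import math

# 95% Wald CI for the SW conditional log odds ratio in the
# homogeneous association model (rounded estimate and ASE from the text)
est, ase, z = 0.3872, 0.167, 1.96
lo, hi = est - z * ase, est + z * ase
print(round(lo, 2), round(hi, 2))                      # CI for log(theta)
print(round(math.exp(lo), 2), round(math.exp(hi), 2))  # CI for theta
```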


But the lower bound is very close to 1. This should raise the question for you: Is this term really significant? Could we go, after all, with a simpler model (MS,MW)?

Effects of sample size and practical significance

Of the two models in our example, which one is the "better" model? How do we choose between the simpler, more parsimonious model and the one that is more complex?

Criterion                                          (MS, MW)            (MS, MW, SW)
Model goodness of fit G²                           5.39                0.065
                                                   (df = 2, p = .07)   (df = 1, p = .80)
Largest adjusted residual                          1.69                0.25
Likelihood-ratio test of \(\lambda_{jk}^{SW}=0\)   -                   ΔG² = 5.325
                                                                       (df = 1, p = .02)
Complexity                                         simpler             more complex

Do we really need the SW partial association?

It may be a weak effect, i.e., it may not be important in a practical sense, even though it appears statistically significant; such significance can be due to the large sample size (n = 715) relative to a small table (\(N = 2^3 = 8\) cells).

So how do we choose between these two? How do we weigh the trade-off between a simple, parsimonious model (practical significance) and a more complex model that is closer to reality (statistical significance)? This decision is often driven by the context of the problem and the analysis we have done above. But let's look at a few more statistics that can aid us in choosing the "best" model.