5.8 - Example: Mixed Model

Example 5-3

Consider the following experimental setting: “A study of teacher self-rating of classroom control is made where investigators want to compare Eastern vs. Western US regions, the type of school (Public vs. Private) and 2 selected Teachers at each school. Each teacher submits scores from 2 classes that they teach.”

East vs. West US makes sense as a fixed effect, and so does Private vs. Public school type. However, the investigators are probably not interested in testing for significant differences among individual teachers; more realistically, they are interested in how much variation there is among teachers (a random effect).

So we set up this ANOVA as a mixed model, one that includes both fixed and random effects. In this example, we include teacher as a random effect nested within the factorial (fixed effect) treatment combinations of Region and School Type.

Let’s say the data (Schools Data) were as follows:

region  school_type  teacher  class  SR_score
EastUS  Private      1        1      81
EastUS  Private      1        2      85
EastUS  Private      2        1      88
EastUS  Private      2        2      89
EastUS  Public       1        1      68
EastUS  Public       1        2      72
EastUS  Public       2        1      78
EastUS  Public       2        2      75
WestUS  Private      1        1      88
WestUS  Private      1        2      90
WestUS  Private      2        1      91
WestUS  Private      2        2      89
WestUS  Public       1        1      94
WestUS  Public       1        2      97
WestUS  Public       2        1      93
WestUS  Public       2        2      89
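Because the design is balanced, the Type 3 sums of squares can be reproduced by hand from the group means. The following is a minimal Python sketch (not part of the lesson, standard library only) that computes them directly from the raw data:

```python
# Sketch: computing the nested ANOVA sums of squares for the Schools Data
# by hand. Rows are (region, school_type, teacher, class, SR_score).
data = [
    ("EastUS", "Private", 1, 1, 81), ("EastUS", "Private", 1, 2, 85),
    ("EastUS", "Private", 2, 1, 88), ("EastUS", "Private", 2, 2, 89),
    ("EastUS", "Public", 1, 1, 68), ("EastUS", "Public", 1, 2, 72),
    ("EastUS", "Public", 2, 1, 78), ("EastUS", "Public", 2, 2, 75),
    ("WestUS", "Private", 1, 1, 88), ("WestUS", "Private", 1, 2, 90),
    ("WestUS", "Private", 2, 1, 91), ("WestUS", "Private", 2, 2, 89),
    ("WestUS", "Public", 1, 1, 94), ("WestUS", "Public", 1, 2, 97),
    ("WestUS", "Public", 2, 1, 93), ("WestUS", "Public", 2, 2, 89),
]

def mean(xs):
    return sum(xs) / len(xs)

def group_means(key):
    """Mean SR_score for each level combination selected by `key`."""
    groups = {}
    for row in data:
        groups.setdefault(key(row), []).append(row[4])
    return {k: mean(v) for k, v in groups.items()}

grand = mean([r[4] for r in data])
region_m = group_means(lambda r: r[0])                # 2 levels, 8 obs each
school_m = group_means(lambda r: r[1])                # 2 levels, 8 obs each
cell_m = group_means(lambda r: (r[0], r[1]))          # 4 cells, 4 obs each
teach_m = group_means(lambda r: (r[0], r[1], r[2]))   # 8 teachers, 2 obs each

ss_region = 8 * sum((m - grand) ** 2 for m in region_m.values())
ss_school = 8 * sum((m - grand) ** 2 for m in school_m.values())
ss_cells = 4 * sum((m - grand) ** 2 for m in cell_m.values())
ss_inter = ss_cells - ss_region - ss_school
ss_teach = 2 * sum((m - cell_m[(k[0], k[1])]) ** 2 for k, m in teach_m.items())
ss_resid = sum((r[4] - teach_m[(r[0], r[1], r[2])]) ** 2 for r in data)

print(ss_region, ss_school, ss_inter, ss_teach, ss_resid)
# 564.0625 76.5625 264.0625 93.75 37.5
```

These match the sums of squares in the SAS output below.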

  Using SAS

In SAS we would set up the ANOVA as:

proc mixed data=school covtest method=type3;
  class region school_type teacher class;
  model sr_score = region school_type region*school_type;
  random teacher(region*school_type);
run;

Here we see that the fixed effects appear in the model statement, and the nested random effect appears in the random statement.

We get the following partial output:

Type 3 Analysis of Variance

Source                  DF   Sum of Squares   Mean Square   Error Term                 F Value   Pr > F
region                   1           564.06        564.06   MS(teach(region*school))     24.07   0.0080
school_type              1            76.56         76.56   MS(teach(region*school))      3.27   0.1450
region*school_type       1           264.06        264.06   MS(teach(region*school))     11.27   0.0284
teach(region*school)     4            93.75         23.44   MS(Residual)                  5.00   0.0257
Residual                 8            37.50          4.69

Expected Mean Squares:
region:                Var(Residual) + 2 Var(teach(region*school)) + Q(region, region*school_type)
school_type:           Var(Residual) + 2 Var(teach(region*school)) + Q(school_type, region*school_type)
region*school_type:    Var(Residual) + 2 Var(teach(region*school)) + Q(region*school_type)
teach(region*school):  Var(Residual) + 2 Var(teach(region*school))
Residual:              Var(Residual)
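The Error Term column follows from the expected mean squares: each fixed effect's EMS contains Var(Residual) + 2 Var(teacher) plus its own fixed-effect quantity, so MS(teacher) is its denominator, while the teacher effect is tested against MS(Residual). A quick Python sketch (not part of the lesson; unrounded mean squares, i.e. SS/DF) reproduces the F ratios:

```python
# Sketch: forming the F ratios dictated by the expected mean squares.
# Unrounded mean squares (each fixed effect has 1 df, so MS = SS).
ms = {"region": 564.0625, "school_type": 76.5625,
      "region*school_type": 264.0625,
      "teacher": 23.4375, "residual": 4.6875}

# Fixed effects: denominator is MS(teacher), per the EMS column.
f_fixed = {e: ms[e] / ms["teacher"]
           for e in ("region", "school_type", "region*school_type")}

# Random teacher effect: denominator is MS(Residual).
f_teacher = ms["teacher"] / ms["residual"]

for effect, f in f_fixed.items():
    print(effect, round(f, 2))   # 24.07, 3.27, 11.27
print("teacher", f_teacher)      # 5.0
```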

The results for hypothesis tests for the fixed effects appear as:

Type 3 Tests of Fixed Effects

Effect               Num DF   Den DF   F Value   Pr > F
region                    1        4     24.07   0.0080
school_type               1        4      3.27   0.1450
region*school_type        1        4     11.27   0.0284

and a hypothesis test for the random effect appears as:

Covariance Parameter Estimates

Cov Parm               Estimate   Standard Error   Z Value   Pr Z
teach(region*school)     9.3750           8.3689      1.12   0.2626
Residual                 4.6875           2.3438      2.00   0.0228

The last part of the output shown above is produced by the “covtest” option in the proc mixed statement and indicates that \(H_0 \colon \sigma_{teacher}^{2} = 0\) cannot be rejected at \(\alpha = 0.05\) (the p-value was 0.2626). It is important to point out that this result is inconsistent with the ANOVA table above, where the F test for teacher(region*school_type) is significant (Pr > F = 0.0257). The reason is that the covtest option uses the Wald Z test, which is dubious at best: it is very untrustworthy with small numbers of levels of the random factor.

So how should we test the null hypothesis for the significance of the random effect? All the advice I have been able to gather on this topic points to shying away from the covtest option for testing the significance of a random effect unless you have a very large number of levels of the random effect, and instead relying on the traditional F test from the ANOVA table. I included the covtest option here to illustrate this pitfall in working with mixed models.
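For a balanced design like this one, both the variance-component estimates and the Wald Z statistics can be recovered from the ANOVA mean squares. The sketch below (not part of the lesson) uses a moment-based standard error, Var(MS) ≈ 2·MS²/df; this is an approximation rather than SAS's exact computation, though it matches the covtest output here:

```python
# Sketch (assumption flagged): variance components and Wald Z statistics
# from the mean squares. The SE formula is a moment-based approximation.
import math

ms_teacher, df_teacher = 23.4375, 4   # MS and df for teach(region*school)
ms_resid, df_resid = 4.6875, 8        # MS and df for Residual

# EMS: MS(teacher) estimates Var(Residual) + 2 Var(teacher), hence:
var_teacher = (ms_teacher - ms_resid) / 2   # 9.375, as in the SAS output
var_resid = ms_resid                        # 4.6875

# Approximate Wald standard errors via Var(MS) ~ 2*MS^2/df.
se_teacher = math.sqrt((2 * ms_teacher**2 / df_teacher
                        + 2 * ms_resid**2 / df_resid) / 4)
se_resid = math.sqrt(2 * ms_resid**2 / df_resid)

print(var_teacher, round(se_teacher, 4), round(var_teacher / se_teacher, 2))
print(var_resid, round(se_resid, 4), round(var_resid / se_resid, 2))
# 9.375 8.3689 1.12
# 4.6875 2.3438 2.0
```

The Z of 1.12 for the teacher component rests on only 8 teacher levels, which is exactly the situation where the Wald test is untrustworthy; the F test with 4 and 8 df is the better guide here.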

Given that the region*school_type interaction is significant, we could go ahead and use the PLM procedure with:

lsmeans region*school_type / adjust=tukey lines;

to generate the Tukey mean comparisons and produce the groupings (letterings) to identify what means differ significantly.
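The logic behind those groupings can be sketched by hand. In the sketch below (not SAS output), the four cell means are compared against MS(teacher) with 4 df, the error term for these means; the critical value q(0.05; 4 means, 4 df) is hard-coded from a standard studentized-range table and is an assumption of the sketch:

```python
# Sketch (assumptions flagged): hand-rolled Tukey comparisons of the four
# region*school_type cell means, using MS(teacher) as the error term.
import math
from itertools import combinations

cell_means = {("EastUS", "Private"): 85.75, ("EastUS", "Public"): 73.25,
              ("WestUS", "Private"): 89.50, ("WestUS", "Public"): 93.25}
ms_error, n_per_cell = 23.4375, 4   # MS(teach(region*school)), 4 obs per cell
q_crit = 5.757                      # q(0.05, k=4, df=4), from a table

# Minimum significant difference (Tukey's HSD)
hsd = q_crit * math.sqrt(ms_error / n_per_cell)

significant = {}
for a, b in combinations(sorted(cell_means), 2):
    diff = abs(cell_means[a] - cell_means[b])
    significant[(a, b)] = diff > hsd
    print(a, b, round(diff, 2), "sig" if diff > hsd else "ns")
```

With these numbers only EastUS Public differs significantly from the two WestUS cells; the other pairwise differences fall short of the HSD of about 13.9.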

  Using Minitab

In Minitab, specifying the mixed model is a little different.

In Stat > ANOVA > General Linear Model, we complete the dialog box:

[Figure: General Linear Model dialog box in Minitab]

In the next panel we can create interaction terms. At this point, all factors are assumed to be fixed effects. You will get a prompt:

[Figure: General Linear Model dialog box in Minitab]

To create nested terms and to specify which effects are random effects, we have to use a different tab:

[Figure: General Linear Model dialog box in Minitab]
[Figure: General Linear Model dialog box in Minitab]

This would correspond to:

model SR_score = region school_type region*school_type;

random teacher(region*school_type);

in SAS code.

Minitab Output for the mixed model:

General Linear Model: SR_score versus region, school_type, teacher

Factor                        Type     Levels   Values
region                        fixed         2   EastUS, WestUS
school_type                   fixed         2   Private, Public
teacher(region school_type)   random        8   1, 2, 1, 2, 1, 2, 1, 2

Analysis of Variance for SR_score, using Adjusted SS for Tests

Source                        DF    Seq SS    Adj SS   Adj MS       F       P
region                         1    564.06    564.06   564.06   24.07   0.008
school_type                    1     76.56     76.56    76.56    3.27   0.145
region*school_type             1    264.06    264.06   264.06   11.27   0.028
teacher(region school_type)    4     93.75     93.75    23.44    5.00   0.026
Error                          8     37.50     37.50     4.69
Total                         15   1035.94

S = 2.16506  R-Sq = 96.38%  R-Sq(adj) = 93.21%

Minitab’s results are in agreement with SAS Proc Mixed.