12.1 - The General Linear Mixed Model

Earlier in Lesson 9 we introduced the General Linear Model using matrix notation:

\(\mathbf{Y}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\epsilon}\)

The symbols are in bold to indicate that \(\mathbf{Y}\), \(\mathbf{X}\), \(\boldsymbol{\beta}\), and \(\boldsymbol{\epsilon}\) are vectors or matrices rather than single numbers.

In this equation, the design matrix \(\mathbf{X}\) contains the fixed effects for the model. For the categorical factor levels in ANOVA, its columns are coded (indicator) variables. For the continuous covariates in ANCOVA, its columns hold the covariate values themselves; the corresponding regression coefficients in \(\boldsymbol{\beta}\) are fixed effects as well.

To accommodate random effects, this model is expanded to include a design matrix for the random effects, \( \mathbf{Z} \), analogous to \(\mathbf{X}\) for the fixed effects, and a vector of random effects, \(\boldsymbol{\gamma}\). With this addition, the General Linear Mixed Model becomes:

\(\mathbf{Y}=\mathbf{X}\boldsymbol{\beta}+\mathbf{Z}\boldsymbol{\gamma}+\boldsymbol{\epsilon}\).
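To make the roles of \(\mathbf{X}\) and \(\mathbf{Z}\) concrete, here is a minimal sketch in plain Python. The design (two treatments as a fixed factor, three blocks as a random factor), the reference coding of the treatment, and all numeric values for \(\boldsymbol{\beta}\) and \(\boldsymbol{\gamma}\) are assumptions made up for illustration; they do not come from any data set in this course.

```python
# Hypothetical toy design: 2 treatments (fixed) crossed with 3 blocks (random),
# one observation per treatment-block combination.
treatments = ["A", "A", "A", "B", "B", "B"]
blocks = [1, 2, 3, 1, 2, 3]


def indicator_columns(labels, levels):
    """Return one row per observation with a 0/1 column for each level."""
    return [[1 if lab == lev else 0 for lev in levels] for lab in labels]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# X: intercept column plus an indicator for treatment B (reference coding).
X = [[1, 1 if t == "B" else 0] for t in treatments]

# Z: one indicator column per block.
Z = indicator_columns(blocks, [1, 2, 3])

beta = [10.0, 2.0]         # assumed fixed effects: intercept, shift for treatment B
gamma = [0.5, -1.0, 0.5]   # assumed realized block effects

fixed_part = matvec(X, beta)     # X * beta
random_part = matvec(Z, gamma)   # Z * gamma

# Y would equal this plus the residual vector epsilon.
mean_given_blocks = [f + r for f, r in zip(fixed_part, random_part)]
```

Each row of `X` and `Z` belongs to one observation, so the two products line up element by element with \(\mathbf{Y}\).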

When the model includes repeated measures, we impose a variance/covariance structure on \( \boldsymbol{\epsilon}\): we assume that \( \boldsymbol{\epsilon}\) is normally distributed with a mean of 0 and a variance-covariance matrix \( \mathbf{R} \).
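As one illustration of such a structure, the sketch below builds the block of \( \mathbf{R} \) for a single subject under a first-order autoregressive, AR(1), pattern, in which the covariance between two time points decays with the distance between them. The choice of AR(1), the function name `ar1_covariance`, and the values of the variance and correlation are assumptions for illustration only.

```python
def ar1_covariance(n_times, sigma2, rho):
    """Covariance block for one subject: entry (i, j) is sigma2 * rho**|i - j|."""
    return [
        [sigma2 * rho ** abs(i - j) for j in range(n_times)]
        for i in range(n_times)
    ]


# Hypothetical values: 4 time points, variance 2.0, lag-1 correlation 0.5.
R_subject = ar1_covariance(4, sigma2=2.0, rho=0.5)

for row in R_subject:
    print(row)
```

With independent and homogeneous errors, this block would instead be the variance times an identity matrix, which is the default that the non-default structures replace.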

Putting it all together, the final form of the General Linear Mixed Model is:

\(\mathbf{Y} = \underbrace{\mathbf{X}\boldsymbol{\beta}}_{\substack{\text{Specified in the}\\ \text{MODEL statement}}} + \underbrace{\mathbf{Z}\boldsymbol{\gamma}}_{\substack{\text{Specified in the}\\ \text{RANDOM statement}}}+\underbrace{\overbrace{\boldsymbol{\epsilon}}^{\substack{\text{Specified in the}\\ \text{REPEATED statement for}\\ \text{non-default structures}}}}_{\substack{\text{No longer required to}\\ \text{be independent}\\ \text{and homogeneous}}}\)

where we assume \(\boldsymbol{\gamma} \sim N(0, \mathbf{G}) \text{ and } \boldsymbol{\epsilon} \sim N(0, \mathbf{R})\).

So we have the expected value of the response variable as:

\( E(\mathbf{Y})=\mathbf{X}\boldsymbol{\beta} \)

and the variance of the response variable is:

\( Var(\mathbf{Y})=\mathbf{Z}\mathbf{G}\mathbf{Z}^{'}+\mathbf{R}=\mathbf{V} \)
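The variance formula can be checked by hand on a small case. The sketch below assumes four observations in two blocks, a single block variance on the diagonal of \( \mathbf{G} \), and independent, homogeneous errors in \( \mathbf{R} \); the helper functions and the numeric variances are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def scaled_identity(n, s):
    """Return an n x n matrix with s on the diagonal and 0 elsewhere."""
    return [[s if i == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical design: observations 1-2 are in block 1, observations 3-4 in block 2.
Z = [[1, 0], [1, 0], [0, 1], [0, 1]]

sigma2_block = 4.0   # assumed variance of the random block effects
sigma2_error = 1.0   # assumed residual variance

G = scaled_identity(2, sigma2_block)   # Var(gamma)
R = scaled_identity(4, sigma2_error)   # Var(epsilon), default structure

ZGZt = matmul(matmul(Z, G), transpose(Z))
V = [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ZGZt, R)]

for row in V:
    print(row)
```

In the resulting \( \mathbf{V} \), two observations from the same block share a covariance equal to the block variance, while observations from different blocks have covariance 0.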

In the case where we had only fixed effects in the model and no repeated measures, the only source of random variation was \( \boldsymbol{\epsilon}\). In the case where we didn't have any fixed effects (e.g. the fully nested random effects model), we had estimates only for the variances associated with \( \boldsymbol{\gamma} \) and \( \boldsymbol{\epsilon} \), and we were able to express these variance components as percentages of the total variance. Finally, when we introduced the covariance structures in repeated measures, we were specifying the terms of \( \mathbf{R} \) and evaluating which covariance structure provided the best fit to the data.
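Expressing variance components as percentages is simple arithmetic: each estimated component is divided by the sum of all components. The sketch below uses made-up estimates for a hypothetical fully nested design; the component names and values are assumptions, not results from this course.

```python
# Hypothetical variance component estimates for a fully nested random effects model.
components = {
    "batch": 6.0,
    "sample within batch": 3.0,
    "residual": 1.0,
}

total = sum(components.values())

# Share of the total variance attributed to each source, in percent.
percent = {name: 100.0 * value / total for name, value in components.items()}

for name, share in percent.items():
    print(f"{name}: {share:.1f}%")
```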

As you review the materials covered in this course, try to envision where in the General Linear Mixed Model the various elements of the analysis fit. I have found this to be helpful in keeping track of complex experimental designs.

The example exercises (Lesson 14.2 through Lesson 14.8) in this lesson are provided as practice in 'dissecting' an experiment and identifying the elements of the treatment and randomization designs that guide the development of the ANOVA model. After working through the sets of questions (checklist style), you have the opportunity to identify the experimental design. Try to work through these and contribute to the discussion board as others share their ideas on how to identify the experimental design.