10.2 - Correlated Residuals

Note! This section is for illustration of the origin of the covariance structure, by capturing the residuals for each time period and looking at the simple correlations for pairs of time periods. Note that the procedure used in this section is NOT what we would ordinarily do in conducting a repeated measures analysis. The work done here is automatically done for you by SAS and the results appear in the off-diagonals of the variance-covariance (R block) matrix.
The other thing to note is that our work with repeated measures covariance structures is shown in SAS. Minitab currently does not offer programming to accommodate various covariance structures, opting instead to treat repeated measures as 'split plot in time' (which assumes compound symmetry).

If we look at the ANOVA mixed model in general terms, we have:

Model: response = fixed effects + random effects + residual

In the case of repeated measures, the residual term has a variance-covariance matrix rather than a single variance. The diagonal elements of this matrix are the residual variances at each time point, and the off-diagonal elements are the covariances between pairs of time points. The variance-covariance matrix can be written as follows, which helps visualize the repeated measures model:

\(\Sigma_i=\begin{bmatrix}
\sigma_{1}^2 & \sigma_{12} & \ldots & \sigma_{1n_i}\\
\sigma_{21} & \sigma^2_{2} & \ldots & \sigma_{2n_i}\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_{n_i1} & \sigma_{n_i2} & \ldots & \sigma^2_{n_i}
\end{bmatrix}\)

The structure shown above is the Unstructured covariance structure. To show the origin of the correlations, we can generate the matrix of correlation coefficients from a hypothetical dataset (Mixed Unstructured SAS Code) using SAS (remember, Minitab currently does not offer programming to accommodate various covariance structures).
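As a reminder of how the two matrices relate, each correlation is the corresponding covariance scaled by the two standard deviations. The symbol \(\rho_{jk}\) is notation introduced here for the correlation between time points j and k; it is not used elsewhere in this section:

\(\rho_{jk}=\dfrac{\sigma_{jk}}{\sqrt{\sigma_j^2\,\sigma_k^2}}, \qquad j \ne k\)

A matrix of pairwise correlations therefore carries the same off-diagonal information as \(\Sigma_i\), expressed on a unit-free scale.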

Here in this example dataset (Repeated Measures Example Data), there are 3 levels of a single treatment. Subjects are assigned a treatment level at random (CRD) and then are measured at three time points.

data rmanova;
input trt $ time subject resp;
datalines;
A    1    1   10
A    1    2   12
A    1    3   13
A    2    1   16
A    2    2   19
A    2    3   20
A    3    1   25
A    3    2   27
A    3    3   28
B    1    4   12
B    1    5   11
B    1    6   10
B    2    4   18
B    2    5   20
B    2    6   22
B    3    4   25
B    3    5   26
B    3    6   27
C    1    7   10
C    1    8   12
C    1    9   13
C    2    7   22
C    2    8   23
C    2    9   22
C    3    7   31
C    3    8   34
C    3    9   33
;

We can run a simple model and obtain the residuals:

/* 2-factor factorial for trt and time - saving residuals */
proc mixed data=rmanova method=type3;
class trt time subject;
model resp=trt time trt*time / ddfm=kr outpm=outmixed;
title 'Two_factor_factorial';
run; title;

Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
trt        2        18       14.52     0.0002
time       2        18       292.72    <.0001
trt*time   4        18       4.67      0.0092

/* re-organize the residuals (unstacked data for correlation) */
/* one data set per time point, each holding that time's residuals */
data one; set outmixed; where time=1; time1=resid; keep time1; run;
data two; set outmixed; where time=2; time2=resid; keep time2; run;
data three; set outmixed; where time=3; time3=resid; keep time3; run;
/* one-to-one merge: row i of each data set belongs to subject i */
data corrcheck; merge one two three; run;
proc print data=corrcheck; run;

proc corr data=corrcheck nosimple; var time1 time2 time3; run;

The residuals then are:

Obs time1 time2 time3
1 -1.66667 -2.33333 -1.66667
2 0.33333 0.66667 0.33333
3 1.33333 1.66667 1.33333
4 1.00000 -2.00000 -1.00000
5 0.00000 0.00000 0.00000
6 -1.00000 2.00000 1.00000
7 -1.66667 -0.33333 -1.66667
8 0.33333 0.66667 1.33333
9 1.33333 -0.33333 0.33333

And the correlations between time points are:

The CORR Procedure

3 Variables: time1 time2 time3

Pearson Correlation Coefficients, N = 9
Prob > |r| under H0: Rho=0

        time1     time2     time3
time1   1.00000   0.19026   0.55882
                  0.6239    0.1178
time2   0.19026   1.00000   0.83239
        0.6239              0.0054
time3   0.55882   0.83239   1.00000
        0.1178    0.0054
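
As a check on the first off-diagonal entry, the time1 and time2 correlation can be reproduced by hand. Because the residuals at each time point average to zero, the Pearson coefficient reduces to a ratio of sums of cross-products and squares taken over the nine subjects. The sums below were computed from the residuals and rounded to four decimals; \(e_{i1}\) and \(e_{i2}\) are notation introduced here for subject i's residuals at time 1 and time 2:

\(\sum e_{i1}e_{i2}=2.6667,\quad \sum e_{i1}^2=11.3333,\quad \sum e_{i2}^2=17.3333\)

\(r_{12}=\dfrac{2.6667}{\sqrt{11.3333\times 17.3333}}=\dfrac{2.6667}{14.0158}\approx 0.1903\)

This agrees with the 0.19026 reported by proc corr.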

We can now see how to work with these correlations in repeated measures analysis in proc mixed.

/* Mixed Model with Repeated Measures - Unstructured */
proc mixed data=rmanova;
class trt time subject;
model resp=trt time trt*time / ddfm=kr;
/* type=un requests the unstructured matrix; rcorr prints it as correlations */
repeated time / subject=subject type=un rcorr;
title 'Unstructured';
run; title;

The repeated statement specifies the repeated variable, and the subject= option identifies the units on which the repeated measures are made. The type= option is where you specify one of many available structures for these correlations, and the rcorr option requests the estimated correlation matrix. Here we specified the Unstructured covariance structure and obtained the same correlations that we generated with simple statistics.

The Mixed Procedure

Estimated R Correlation Matrix for subject 1

Row Col1 Col2 Col3
1 1.0000 0.1903 0.5588
2 0.1903 1.0000 0.8324
3 0.5588 0.8324 1.0000

Finding the best covariance structure is much of the work in modeling repeated measures. We generally consider a small set of candidate structures as we enter into a repeated measures analysis: UN (Unstructured), CS (Compound Symmetry), AR(1) (Autoregressive lag 1) if the time intervals are evenly spaced, or SP(POW) (Spatial Power) if the time intervals are unequally spaced.
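
To make two of these candidates concrete, the sketch below writes out the correlation pattern that CS and AR(1) imply for three time points. The single parameter \(\rho\) is notation introduced here. Both structures also assume a common variance at every time point, so each has 2 covariance parameters, compared with 6 for UN:

\(\text{CS: }\begin{bmatrix}
1 & \rho & \rho\\
\rho & 1 & \rho\\
\rho & \rho & 1
\end{bmatrix}
\qquad
\text{AR(1): }\begin{bmatrix}
1 & \rho & \rho^2\\
\rho & 1 & \rho\\
\rho^2 & \rho & 1
\end{bmatrix}\)

Under CS every pair of time points is equally correlated, while under AR(1) the correlation decays as the time points move further apart.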

To decide which covariance structure is best, we use the information criteria that proc mixed generates automatically:

Fit Statistics

-2 Res Log Likelihood 63.0
AIC (smaller is better) 75.0
AICC (smaller is better) 82.6
BIC (smaller is better) 76.2
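
These values can be reproduced from the -2 Res Log Likelihood, which shows how each criterion penalizes the number of covariance parameters. This is a sketch under assumptions that are not stated in the output itself: d = 6 covariance parameters for the 3 by 3 unstructured matrix, n* = 27 - 9 = 18 (observations minus fixed-effect parameters, as used with REML), and s = 9 subjects for BIC. The symbols d, n* and s are introduced here.

\(\text{AIC}=63.0+2(6)=75.0\)

\(\text{AICC}=63.0+\dfrac{2(6)(18)}{18-6-1}=63.0+19.6=82.6\)

\(\text{BIC}=63.0+6\ln(9)=63.0+13.2=76.2\)

The AICC penalty is the largest of the three because its correction grows as the number of parameters approaches the sample size.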

Smaller or more negative values indicate a better fit to the data. The process amounts to trying various candidate structures and then selecting the covariance structure producing the smallest or most negative values. The various information criteria listed are usually similar in value, but I tend to focus on the AICC for small sample sizes.
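
The comparison described above can be sketched by refitting the same model with a different type= value and collecting the fit statistics from each run. This is a minimal sketch, not part of the original analysis: the data set names fit_cs, fit_ar1 and fit_all are placeholders chosen here, and AR(1) is included on the assumption that the three time points are evenly spaced.

```sas
/* Sketch: refit under compound symmetry and save the fit statistics */
ods output FitStatistics=fit_cs;    /* fit_cs is a placeholder name */
proc mixed data=rmanova;
class trt time subject;
model resp=trt time trt*time / ddfm=kr;
repeated time / subject=subject type=cs rcorr;
title 'Compound Symmetry';
run; title;

/* Same model with AR(1), assuming evenly spaced time points */
ods output FitStatistics=fit_ar1;   /* fit_ar1 is a placeholder name */
proc mixed data=rmanova;
class trt time subject;
model resp=trt time trt*time / ddfm=kr;
repeated time / subject=subject type=ar(1) rcorr;
title 'AR(1)';
run; title;

/* Stack the two tables so the criteria can be read side by side */
data fit_all;
length structure $ 5;
set fit_cs(in=in_cs) fit_ar1;
if in_cs then structure='CS'; else structure='AR(1)';
run;
proc print data=fit_all noobs; run;
```

The AICC rows of fit_all can then be compared with the AICC from the unstructured fit, and the structure with the smallest value carried forward.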