# Lesson 9: Repeated Measures Analysis

## Overview

Repeated measures data come from experiments in which each experimental unit is observed at multiple points in time. Instead of a single observation per unit, we have a series of observations of the same response variable taken over time.

With a single response measured repeatedly, we generally expect the responses over time to be temporally correlated: observations collected close together in time are more likely to be similar to one another than observations collected far apart. The multivariate approach treats observations collected at different points in time as if they were different variables. You will see that two distinctly different approaches are frequently considered in this analysis, one of which involves a univariate analysis.

We will use the following experiment to illustrate the statistical procedures associated with repeated measures data.

## Example 9-1: Dog Experiment

In this experiment, a completely randomized experimental design was carried out to determine the effects of 4 surgical treatments on coronary sinus potassium in a group of 36 dogs. There are 9, 8, 9, and 10 dogs in the four treatment groups, respectively. Each dog was measured at four different points in time following one of four experimental treatments:

1. Control - no surgical treatment is applied
2. Extrinsic cardiac denervation immediately prior to treatment.
3. Bilateral thoracic sympathectomy and stellectomy 3 weeks prior to treatment.
4. Extrinsic cardiac denervation 3 weeks prior to treatment.

Coronary sinus potassium levels were measured at 1, 5, 9, and 13 minutes following a procedure called an occlusion. We are looking at the effect of the occlusion on the coronary sinus potassium levels following different surgical treatments.

#### Approaches

There are a number of approaches to analyzing this type of data. The first was proposed before the advent of modern computing so that it could be carried out using hand calculations. Two very common historical approaches are listed first, followed by a more modern approach:

1. Split-plot ANOVA - this is perhaps the most common approach.
2. MANOVA - this is what we will focus on in this lesson.
3. Mixed Models - the more modern approach

## Notation in this Lesson

• $$Y_{ijk}$$ = Potassium level for treatment i in dog j at time k
• a = Number of treatments
• $$n_{i}$$ = Number of replicates of treatment i
• $$N = n _ { 1 } + n _ { 2 } + \ldots + n _ { a }$$ = Total number of experimental units
• t = Number of observations over time

## Objectives

Upon completion of this lesson, you should be able to:

• Use a split-plot ANOVA to test for interactions between treatments and time, and the main effects for treatments and time;
• Use a MANOVA to test for interactions between treatments and time, and for the main effects of treatments;
• Understand why the split-plot ANOVA may give incorrect results; and
• Understand the shortcomings of the application of MANOVA to repeated measures data.

# 9.1 - Approach 1: Split-plot ANOVA

The Split-plot ANOVA is perhaps the most traditional approach, for which hand calculations are not too unreasonable. It involves modeling the data using the linear model shown below:

Model: $$Y_{ijk} = \mu + \alpha_i + \beta_{j(i)}+ \tau_k + (\alpha\tau)_{ik} + \epsilon_{ijk}$$

Using this linear model, we assume that the observation for treatment i, dog j, at time k is equal to an overall mean $$\mu$$ plus the treatment effect $$\alpha_i$$, the effect of the dog within that treatment $$\beta_{j(i)}$$, the effect of time $$\tau_k$$, the effect of the interaction between time and treatment $$(\alpha\tau)_{ik}$$, and the error $$\varepsilon_{ijk}$$.

Such that:

• $$\mu$$ = overall mean
• $$\alpha_i$$ = effect of treatment i
• $$\beta_{j \left( i \right)}$$ = random effect of dog j receiving treatment i
• $$\tau_{k}$$= effect of time k
• $$\left( \alpha \tau \right)_{ik}$$ = treatment by time interaction
• $$\varepsilon_{ijk}$$ = experimental error

### Assumptions:

We are going to make the following assumptions about the data:

1. The errors $$\varepsilon_{ijk}$$ are independently sampled from a normal distribution with mean 0 and variance $$\sigma^2_{\epsilon}$$.

2. The individual dog effects $$\beta_{j \left( i \right)}$$ are also independently sampled from a normal distribution with mean 0 and variance $$\sigma^2_{\beta}$$.

3. The effect of time does not depend on the dog; that is, there is no time by dog interaction. Generally, we need this assumption because otherwise the results would depend on which animal you were looking at, which would mean that we could not predict much for new animals.

Because this model combines a random effect for dog with fixed effects for treatment and time, it is called a mixed effects model.

The analysis is summarized in the Analysis of Variance table below:

| Source | d.f. | SS | MS | F |
|---|---|---|---|---|
| Treatment | $$a - 1$$ | $$SS_{\text{treat}}$$ | $$\dfrac{SS_{\text{treat}}}{a - 1}$$ | $$\dfrac{MS_{\text{treat}}}{MS_{\text{error}(a)}}$$ |
| Error (a) | $$N - a$$ | $$SS_{\text{error}(a)}$$ | $$\dfrac{SS_{\text{error}(a)}}{N - a}$$ | |
| Time | $$t - 1$$ | $$SS_{\text{time}}$$ | $$\dfrac{SS_{\text{time}}}{t - 1}$$ | $$\dfrac{MS_{\text{time}}}{MS_{\text{error}(b)}}$$ |
| Treat x Time | $$(a - 1)(t - 1)$$ | $$SS_{\text{treat x time}}$$ | $$\dfrac{SS_{\text{treat x time}}}{(a - 1)(t - 1)}$$ | $$\dfrac{MS_{\text{treat x time}}}{MS_{\text{error}(b)}}$$ |
| Error (b) | $$(N - a)(t - 1)$$ | $$SS_{\text{error}(b)}$$ | $$\dfrac{SS_{\text{error}(b)}}{(N - a)(t - 1)}$$ | |
| Total | $$Nt - 1$$ | $$SS_{\text{total}}$$ | | |

where,

a: the number of treatments

N: the total number of all experimental units

t: number of time points

The sources of the variation include treatment; Error (a); the effect of Time; the interaction between time and treatment; and Error (b). Error (a) is the effect of subjects within treatments and Error (b) is the individual error in the model.  All these add up to the total.

Sum of Squares Formulas

Here are the formulas that are used to calculate the various Sums of Squares involved:

$$\begin{array}{lll}SS_{total}& =& \sum_{i=1}^{a}\sum_{j=1}^{n_i}\sum_{k=1}^{t}Y^2_{ijk}-Nt\bar{y}^2_{...}\\SS_{treat} &= &t\sum_{i=1}^{a}n_i\bar{y}^2_{i..} - Nt\bar{y}^2_{...}\\SS_{error(a)}& =& t\sum_{i=1}^{a}\sum_{j=1}^{n_i}\bar{y}^2_{ij.} - t\sum_{i=1}^{a}n_i\bar{y}^2_{i..}\\SS_{time}& =& N\sum_{k=1}^{t}\bar{y}^2_{..k}-Nt\bar{y}^2_{...}\\SS_{\text{treat x time}} &=& \sum_{i=1}^{a}\sum_{k=1}^{t}n_i\bar{y}^2_{i.k} - Nt\bar{y}^2_{...}-SS_{treat} -SS_{time}\end{array}$$

Each Mean Square (MS) is obtained by dividing the corresponding Sum of Squares by its degrees of freedom.

To test the main effect of treatment, we compare the MS for treatment to the MS for Error (a).
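The sum-of-squares formulas above can be sketched in code. The following Python function is a minimal illustration, not the SAS implementation: the name `split_plot_ss` and the tiny data set `toy` are made up for this example (they are not the dog data), and Error (b) is obtained by subtraction, relying on the statement above that the sources add up to the total.

```python
# Minimal sketch of the split-plot sums of squares.
# `toy` is hypothetical illustration data, NOT the dog data from this lesson.

def split_plot_ss(data):
    """data maps a treatment label to a list of subjects;
    each subject is a list of t readings taken over time."""
    groups = list(data.values())
    t = len(groups[0][0])                      # number of time points
    N = sum(len(g) for g in groups)            # total number of subjects
    obs = [y for g in groups for subj in g for y in subj]
    grand_mean = sum(obs) / (N * t)
    correction = N * t * grand_mean ** 2       # N t ybar_...^2

    treat_term = subj_term = cell_term = 0.0
    for g in groups:
        n_i = len(g)
        treat_mean = sum(sum(subj) for subj in g) / (n_i * t)
        treat_term += t * n_i * treat_mean ** 2
        subj_term += t * sum((sum(subj) / t) ** 2 for subj in g)
        for k in range(t):
            cell_mean = sum(subj[k] for subj in g) / n_i
            cell_term += n_i * cell_mean ** 2

    time_term = 0.0
    for k in range(t):
        time_mean = sum(subj[k] for g in groups for subj in g) / N
        time_term += N * time_mean ** 2

    ss = {
        "total": sum(y * y for y in obs) - correction,
        "treat": treat_term - correction,
        "error_a": subj_term - treat_term,
        "time": time_term - correction,
    }
    ss["treat_x_time"] = cell_term - correction - ss["treat"] - ss["time"]
    # Error (b) by subtraction: whatever is left of the total.
    ss["error_b"] = (ss["total"] - ss["treat"] - ss["error_a"]
                     - ss["time"] - ss["treat_x_time"])
    return ss


toy = {
    "A": [[4.0, 4.2, 4.5], [3.8, 4.1, 4.0]],
    "B": [[4.6, 5.0, 5.3], [4.4, 4.9, 5.1], [4.7, 4.8, 5.4]],
}
ss = split_plot_ss(toy)
for source, value in ss.items():
    print(f"{source:>12}: {value:.4f}")
```

Dividing each entry by its degrees of freedom from the ANOVA table gives the mean squares, and the F ratios follow from the table's last column.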

We will compare these results with the results we get from the MANOVA, the next approach covered in this lesson.

# 9.2 - Example

## Example 9-2:

#### Using SAS

We will use a SAS program to illustrate this procedure.

#### Using Minitab

Currently not available in Minitab

#### Analysis

Run the SAS program and inspect how it applies this procedure.

Note where the values of interest are located in the output. The results are copied from the SAS output into the table below:

| Source | d.f. | SS | MS | F |
|---|---|---|---|---|
| Treatment | 3 | 19.923 | 6.641 | 6.00 |
| Error (a) | 32 | 35.397 | 1.106 | |
| Time | 3 | 6.204 | 2.068 | 11.15 |
| Interaction | 9 | 3.440 | 0.382 | 2.06 |
| Error (b) | 96 | 17.800 | 0.185 | |
| Total | 143 | 82.320 | | |
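As a quick arithmetic check, the MS and F columns can be recomputed from the d.f. and SS columns of the table above. This is only an illustrative sketch in Python; the variable names are arbitrary, and the numbers are the ones reported in the table.

```python
# Recompute the MS and F columns from the d.f. and SS columns of the table.
table = {
    "Treatment":   (3, 19.923),
    "Error (a)":   (32, 35.397),
    "Time":        (3, 6.204),
    "Interaction": (9, 3.440),
    "Error (b)":   (96, 17.800),
}
ms = {source: ss / df for source, (df, ss) in table.items()}

# Treatment is tested against Error (a); Time and Interaction against Error (b).
f_ratios = {
    "Treatment":   ms["Treatment"] / ms["Error (a)"],
    "Time":        ms["Time"] / ms["Error (b)"],
    "Interaction": ms["Interaction"] / ms["Error (b)"],
}
for source, f in f_ratios.items():
    print(f"{source}: MS = {ms[source]:.3f}, F = {f:.2f}")
```

The recomputed values should agree with the table to the reported precision.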

#### Hypotheses Tests

Now that we have the results from the analysis, the first thing that we want to look at is the interaction between treatment and time. We want to determine here if the effect of treatment depends on time. Therefore, we will start with:

1. The interaction between treatment and time, or:

$$H_0\colon (\alpha\tau)_{ik} = 0$$ for all $$i = 1,2, \dots, a;$$ $$k = 1,2, \dots, t$$

Here we need to look at the treatment by time interaction term, whose F-value is reported as 2.06. We want to compare this to an F-distribution with (a - 1)(t - 1) = 9 and (N - a)(t - 1) = 96 degrees of freedom. The numerator d.f. of 9 is tied to the source of variation due to the interaction, while the denominator d.f. is tied to the source of variation due to Error (b).

We can reject $$H_0$$ at level $$\alpha$$ if

$$F = \dfrac{MS_{\text{treat x time}}}{MS_{error(b)}} > F_{(a-1)(t-1), (N-a)(t-1), \alpha}$$

Therefore, we want to compare this to an F with 9 and 96 degrees of freedom. Here we see that this is significant with a p-value of 0.0406.

Result: We can conclude that the effect of treatment depends on time (F = 2.06; d. f. = 9, 96; p = 0.0406)

Next Steps...

• Because the interaction between treatment and time is significant, the next step in the analysis would be to explore the nature of that interaction using profile plots (we will look at these later).
• If the interaction between treatment and time were not significant, the next step in the analysis would be to test for the main effects of treatment and time.

2. Let's suppose that we had not found a significant interaction, so that you can see what it would look like to consider the effects of treatment.

Consider testing the null hypothesis that there are no treatment effects, or

$$H_0\colon \alpha_1 = \alpha_2 = \dots = \alpha_a = 0$$

To test this null hypothesis, we compute the F-ratio between the Mean Square for Treatment and the Mean Square for Error (a). We then reject $$H_0$$ at level $$\alpha$$ if

$$F = \dfrac{MS_{treat}}{MS_{error(a)}} > F_{a-1, N-a, \alpha}$$

Here, the numerator degrees of freedom is equal to the number of degrees of freedom a - 1 = 3 for treatment, while the denominator degrees of freedom is equal to the number of degrees of freedom N - a = 32 for Error(a).

Result: We can conclude that the treatment significantly affects the mean coronary sinus potassium over the t = 4 sampling times (F = 6.00; d. f. = 3, 32; p = 0.0023).

3. Consider testing the effects of time:

$$H_0\colon \tau_1 = \tau_2 = \dots = \tau_t = 0$$

To test this null hypothesis, we compute the F-ratio between the Mean Square for Time and the Mean Square for Error (b). We then reject $$H_0$$ at level $$\alpha$$ if

$$F = \dfrac{MS_{time}}{MS_{error(b)}} > F_{t-1, (N-a)(t-1), \alpha}$$

Here, the numerator degrees of freedom is equal to the number of degrees of freedom t - 1 = 3 for time, while the denominator degrees of freedom is equal to the number of degrees of freedom (N - a)(t - 1) = 96 for Error(b).

Result: We can conclude that coronary sinus potassium varies significantly over time (F = 11.15; d. f. = 3, 96; p < 0.0001).

# 9.3 - Some Criticisms about the Split-plot ANOVA Approach

This approach and these results assume a constant correlation between any two observations from the same dog. This assumption is unlikely to hold because, typically, repeated measurements on the same subject are temporally correlated: observations collected at times that are close together tend to be more similar to one another than observations collected far apart.
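To see where the constant correlation comes from, one can work out the covariances implied by the split-plot model under its stated assumptions (independent dog effects and independent errors). For two observations on the same dog at different times $$k \ne k'$$, only the dog effect $$\beta_{j(i)}$$ is shared, so

$$\mathrm{cov}(Y_{ijk}, Y_{ijk'}) = \sigma^2_{\beta}, \qquad \mathrm{var}(Y_{ijk}) = \sigma^2_{\beta} + \sigma^2_{\epsilon}$$

and therefore

$$\mathrm{corr}(Y_{ijk}, Y_{ijk'}) = \dfrac{\sigma^2_{\beta}}{\sigma^2_{\beta} + \sigma^2_{\epsilon}}$$

The result does not involve k or k', so measurements taken 4 minutes apart are assumed to be exactly as correlated as measurements taken 12 minutes apart.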

This motivates an alternative approach, which is to treat this situation as a Multivariate Analysis of Variance problem instead of an Analysis of Variance problem.

# 9.4 - Approach 2: MANOVA

When taking a multivariate approach, we collect the observations over time from the same dog (dog j receiving treatment i) into a vector:

$$\mathbf{Y}_{ij} = \left(\begin{array}{c}Y_{ij1}\\ Y_{ij2} \\ \vdots\\ Y_{ijt}\end{array}\right)$$

We treat the data collected at different points in time as if it were data from different variables. Basically, we have a vector of observations for dog j receiving treatment i and each entry corresponds to data collected at a particular point in time.

The usual assumptions are made for a one-way MANOVA. In this case:

1. Dogs receiving treatment i have common mean vector $$\mu_{i}$$
2. All dogs have common variance-covariance matrix $$\Sigma$$
3. Data from different dogs are independently sampled
4. Data are multivariate normally distributed

### The Analysis

Step 1: Use a MANOVA to test for overall differences among the treatments in the mean vectors of the observations over the four time points.

#### Using SAS

We will use the Dog SAS program to perform this multivariate analysis.

We use the glm procedure to analyze these data. In this case, we look at a one-way MANOVA, because we have only one classification variable: treatment.

The model statement includes the variables of interest on the left-hand side of the equal sign. In this case, they are p1, p2, p3, and p4 (the potassium levels at four different points in time). We put the explanatory variable, treatment, on the right-hand side of the equal sign.

The first manova statement tests the hypothesis that the mean vector of observations over time does not depend on treatment. The print option asks for the error sums of squares and cross products matrix as well as the partial correlations.

The second manova statement tests for the main effects of treatment. We'll return to this later.

The third manova statement tests for the interaction between treatment and time. We'll also return to this later.

Right now, the result that we want to focus on is the Wilks Lambda of 0.484, and the corresponding F-approximation of 2.02 with 12, 77 d.f. A p-value of 0.0332 indicates that we can reject the null hypothesis that there is no treatment effect.

Our Conclusion at this point: There are significant differences between at least one pair of treatments in at least one measurement of time $$\left( \Lambda = 0.485; F = 2.02; d.f. = 12, 77; p = 0.0332 \right)$$.

## Next Steps...

If we find that there is a significant difference, then with repeated measures data we tend to focus on a couple of additional questions:

• First Question

Is there a significant treatment by time interaction? Or, in other words, does the effect of treatment depend on the observation time? Previously in the ANOVA analysis, this question was evaluated by looking at the F-value, 2.06. This was reported as a significant result. If we find that this is a significant interaction, the next thing we need to address is, what is the nature of that interaction?

• Alternative Question

If we do not find a significant interaction, then we can collapse the data and determine if the average sinus potassium level over time differs significantly among treatments.  Here, we are looking at the main effects of treatment.

Let's proceed...

# 9.5 - Step 2: Test for treatment by time interactions

#### Using SAS

To test for treatment by time interactions we need to carry out a Profile Analysis. We can create a Profile Plot as shown in the Dog SAS program. (This program is similar in structure to swiss13a.sas used in the Hotelling's T-square lesson previously.)

Here, we want to plot the treatment means against time for each of our four treatments. We can then examine the form the interactions take if they are deemed significant.

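```sas
options ls=78;
title "Profile Plot - Dog Data";

/* Read one record per dog, then write one record per dog per time point */
data dogs;
  infile "D:\Statistics\STAT 505\data\dog1.txt";
  input treat dog p1 p2 p3 p4;
  time=1;  k=p1; output;
  time=5;  k=p2; output;
  time=9;  k=p3; output;
  time=13; k=p4; output;
  drop p1 p2 p3 p4;
run;

/* Sort so that BY-group processing works in proc means */
proc sort;
  by treat time;
run;

/* Mean potassium level for each treatment at each time, saved in data set a */
proc means;
  by treat time;
  var k;
  output out=a mean=mean;
run;

/* Send the graph to a PostScript file */
filename t1 "dog.ps";
goptions device=ps300 gsfname=t1 gsfmode=replace;

/* Plot the treatment means against time, one joined line per treatment */
proc gplot;
  axis1 length=4 in;
  axis2 length=6 in;
  plot mean*time=treat / vaxis=axis1 haxis=axis2;
  symbol1 v=J f=special h=2 i=join color=black;
  symbol2 v=K f=special h=2 i=join color=black;
  symbol3 v=L f=special h=2 i=join color=black;
  symbol4 v=M f=special h=2 i=join color=black;
run;
```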

This program plots the treatment means against time, separately for each treatment. Here, the means for treatment 1 are given by the circles, treatment 2 squares, treatment 3 triangles and treatment 4 stars.

The test for interaction tests the hypothesis that these line segments are parallel to one another.

To test for interaction, we define a new data vector for each observation. Here we consider the data vector for dog j receiving treatment i. This data vector is obtained by subtracting the data at time 1 from the data at time 2, the data at time 2 from the data at time 3, and so on.

This yields the vector of differences between successive times and is expressed as follows:

$$\mathbf{Z}_{ij} = \left(\begin{array}{c}Z_{ij1}\\ Z_{ij2} \\ \vdots \\ Z_{ij, t-1}\end{array}\right) = \left(\begin{array}{c}Y_{ij2}-Y_{ij1}\\ Y_{ij3}-Y_{ij2} \\ \vdots \\Y_{ijt}-Y_{ij,t-1}\end{array}\right)$$

Because this vector is a function of the random data, it is a random vector, and so has a population mean. Thus, for treatment i, we define the population mean vector $$E(\mathbf{Z}_{ij}) = \boldsymbol{\mu}_{Z_i}$$.

Then we will perform a MANOVA on these $$\mathbf{Z}_{ij}$$'s to test the null hypothesis that

$$H_0\colon \boldsymbol{\mu}_{Z_1} = \boldsymbol{\mu}_{Z_2} = \dots = \boldsymbol{\mu}_{Z_a}$$
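The transformation to successive differences, and its link to parallel profiles, can be sketched as follows. This is a small Python illustration with made-up readings (not the dog data) and hypothetical helper names: when one group's mean profile is a constant shift of another's, the two groups have the same mean difference vector, which is exactly what the null hypothesis states.

```python
# Successive-difference transformation used in the profile analysis.
# The readings below are hypothetical, NOT taken from the dog data.

def successive_differences(y):
    """Map (Y_1, ..., Y_t) to (Y_2 - Y_1, ..., Y_t - Y_{t-1})."""
    return [later - earlier for earlier, later in zip(y, y[1:])]

def mean_vector(vectors):
    """Componentwise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Two hypothetical treatment groups whose mean profiles are parallel:
# group B is group A shifted up by a constant 0.5 at every time point.
group_a = [[4.0, 4.5, 4.25, 4.0], [3.5, 4.0, 4.25, 3.75]]
group_b = [[y + 0.5 for y in dog] for dog in group_a]

z_a = [successive_differences(dog) for dog in group_a]
z_b = [successive_differences(dog) for dog in group_b]

print(z_a[0])             # differences for the first dog in group A
print(mean_vector(z_a))   # mean difference vector, group A
print(mean_vector(z_b))   # identical to group A because the profiles are parallel
```

A constant shift cancels in every difference, so any departure between the groups' mean difference vectors reflects non-parallel profiles, that is, a treatment by time interaction.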

#### Using SAS

The SAS program carries out this MANOVA procedure in its third manova statement.

In the third manova statement, we are testing for interaction between treatment and time. We obtain the vector Z by setting m equal to the differences between the data at successive times, i.e., p2-p1, p3-p2, and p4-p3. This carries out the profile analysis or, equivalently, tests for interactions between treatment and time.

Let's look at the output. Again, be careful when you look at the results to make sure you are in the right part of the output.

```
The GLM Procedure
Multivariate Analysis of Variance

M Matrix Describing Transformed Variables

          p1    p2    p3    p4
MVAR1     -1     1     0     0
MVAR2      0    -1     1     0
MVAR3      0     0    -1     1
```

The table above, titled M Matrix Describing Transformed Variables, shows how each element of the transformed vector (MVAR1, MVAR2, and MVAR3) is defined in terms of the original variables.

For MVAR1 we have minus p1 plus p2, for MVAR2 we have minus p2 plus p3, and so on.

The results are then found below this table in the SAS output:

```
MANOVA Test Criteria and F Approximations for
the Hypothesis of No Overall treat Effect
on the Variables Defined by the M Matrix Transformation
H = Type III SSCP Matrix for treat
E = Error SSCP Matrix

S=3 M=-0.5 N=14

Statistic                   Value        F Value   Num DF   Den DF    Pr > F
Wilks' Lambda               0.59835958   1.91      9        73.163    0.0637
Pillai's Trace              0.44352640   1.85      9        96        0.0689
Hotelling-Lawley Trace      0.60246548   1.96      9        44.068    0.0672
Roy's Greatest Root         0.46206108   4.93      3        32        0.0063

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
```

Here we get a Wilks' Lambda of 0.598 with a corresponding F-value of 1.91 on 9 and 73 d.f.

The p-value of 0.0637 is not significant if we strictly adhere to the 0.05 significance level.
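The F-value and the fractional denominator degrees of freedom reported for Wilks' Lambda can be reproduced with Rao's F approximation. The sketch below is an illustration rather than SAS's own code; the function name `rao_f` is made up here, and p (number of transformed variables), q (hypothesis d.f.), and v (error d.f.) are read off this example as p = 3, q = a - 1 = 3, and v = N - a = 32.

```python
import math

def rao_f(wilks, p, q, v):
    """Rao's F approximation for Wilks' Lambda.

    p = number of response variables, q = hypothesis d.f., v = error d.f.
    Returns (F, numerator d.f., denominator d.f.).
    """
    denom = p * p + q * q - 5
    s = math.sqrt((p * p * q * q - 4) / denom) if denom > 0 else 1.0
    root = wilks ** (1.0 / s)
    df1 = p * q
    df2 = s * (v - (p - q + 1) / 2.0) - (p * q - 2) / 2.0
    f = (1.0 - root) / root * df2 / df1
    return f, df1, df2

# Wilks' Lambda for the treatment by time interaction, from the SAS output.
f, df1, df2 = rao_f(0.59835958, p=3, q=3, v=32)
print(f"F = {f:.2f} on {df1} and {df2:.3f} d.f.")
```

The result should match the F-value of 1.91 with 9 and 73.163 d.f. shown in the output, which also explains why the denominator degrees of freedom are not a whole number.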

### Conclusion

There is weak evidence that the effect of treatment depends on time $$\left( \Lambda = 0.598; F = 1.91; d. f. = 9, 73; p = 0.0637 \right)$$.

By reporting the p-value with our results, we allow readers to make their own judgment regarding the significance of the test. Conservative readers might state categorically that 0.0637 is not significant and infer that there is no evidence for interaction. More liberal readers, however, might say that this is very close and consider it weak evidence for an interaction.

# 9.6 - Step 3: Test for the main effects of treatments

Because the treatment by time interaction is deemed not significant, the next step is to test for the main effects of the treatment.

We now define a new variable equal to the sum of the observations for each animal. To test for the main treatment effect, consider the following linear combination of the observations for each dog; that is, the sum of all the data points collected for animal j receiving treatment i.

$$Z_{ij} = Y_{ij1}+Y_{ij2}+\dots + Y_{ijt}$$

This is going to be a random variable and a scalar quantity. We could then define the mean as:

$$E(Z_{ij}) = \mu_{Z_i}$$

Consider testing the hypothesis that all of these means are equal to one another against the alternative that at least two of them are different, or:

$$H_0\colon \mu_{Z_1} = \mu_{Z_2} = \dots = \mu_{Z_a}$$

#### Using SAS

The ANOVA on the data $$Z_{ij}$$ is carried out using the second manova statement in the SAS program:

h=treat sets the hypothesis test about treatments.

Then we set m = p1+p2+p3+p4 to define the random variable Z as in the above.

Again, make sure that you are looking at the correct part of the output. In this case we have defined a single new variable, MVAR, which is the sum of the four potassium measurements.

Results for Wilks Lambda:

```
MANOVA Test Criteria and Exact F Statistics for
the Hypothesis of No Overall treat Effect
on the Variables Defined by the M Matrix Transformation
H = Type III SSCP Matrix for treat
E = Error SSCP Matrix

S=1 M=0.5 N=15

Statistic                   Value        F Value   Num DF   Den DF   Pr > F
Wilks' Lambda               0.63985247   6.00      3        32       0.0023
Pillai's Trace              0.36014753   6.00      3        32       0.0023
Hotelling-Lawley Trace      0.56286025   6.00      3        32       0.0023
Roy's Greatest Root         0.56286025   6.00      3        32       0.0023
```

This indicates that there is a significant main effect of treatment; that is, the mean response over our four time points differs significantly among treatments.

#### Conclusion

Treatments have a significant effect on the average coronary sinus potassium over the first four time points following occlusion $$\left( \Lambda = 0.640; F = 6.00; d. f. = 3, 32; p = 0.0023 \right)$$.

Comparing this result with the results obtained from the split-plot ANOVA, we find that the F-value, p-value, and degrees of freedom are all identical. This is not an accident; it is a mathematical equality.
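The equality can be checked numerically. The sketch below uses a small made-up data set (not the dog data) and arbitrary variable names; it computes the one-way ANOVA F statistic on the per-dog sums, computes the split-plot ratio of MS treatment to MS Error (a) from the sum-of-squares formulas, and then forms Wilks' Lambda, which for a single transformed variable is the error sum of squares divided by the sum of the error and hypothesis sums of squares.

```python
# Numerical check on made-up data (NOT the dog data): the one-way ANOVA F on
# the per-dog sums equals the split-plot ratio MS treatment / MS Error (a).
toy = {
    "A": [[4.0, 4.2, 4.5], [3.8, 4.1, 4.0]],
    "B": [[4.6, 5.0, 5.3], [4.4, 4.9, 5.1], [4.7, 4.8, 5.4]],
    "C": [[4.1, 4.6, 4.4], [4.3, 4.4, 4.9]],
}
t = 3                                        # time points per dog
a = len(toy)                                 # number of treatments
N = sum(len(dogs) for dogs in toy.values())  # total number of dogs

# Route 1: one-way ANOVA on Z_ij = Y_ij1 + ... + Y_ijt.
z = {trt: [sum(dog) for dog in dogs] for trt, dogs in toy.items()}
z_grand = sum(sum(v) for v in z.values()) / N
ss_h = sum(len(v) * (sum(v) / len(v) - z_grand) ** 2 for v in z.values())
ss_e = sum((x - sum(v) / len(v)) ** 2 for v in z.values() for x in v)
f_sums = (ss_h / (a - 1)) / (ss_e / (N - a))

# Route 2: split-plot sums of squares for treatment and Error (a).
grand = sum(y for dogs in toy.values() for dog in dogs for y in dog) / (N * t)
treat_term = sum(
    t * len(dogs) * (sum(sum(dog) for dog in dogs) / (len(dogs) * t)) ** 2
    for dogs in toy.values()
)
dog_term = sum(t * (sum(dog) / t) ** 2 for dogs in toy.values() for dog in dogs)
ss_treat = treat_term - N * t * grand ** 2
ss_error_a = dog_term - treat_term
f_split = (ss_treat / (a - 1)) / (ss_error_a / (N - a))

# With one transformed variable, Wilks' Lambda is SSE / (SSE + SSH).
wilks = ss_e / (ss_e + ss_h)

print(f"F from sums: {f_sums:.6f}, F from split-plot: {f_split:.6f}")
print(f"Wilks' Lambda: {wilks:.6f}")
```

The same relation, $$\Lambda = 1 / \left(1 + F (a - 1)/(N - a)\right)$$, applied to the reported F = 6.00 with 3 and 32 d.f. gives 1/1.5625 = 0.640, in agreement with the output.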

# 9.7 - Approach 3: Mixed Model Analysis

The problem with the multivariate procedure outlined above is that it makes no assumptions regarding the temporal correlation structure of the data and hence may be overparameterized, leading to poor parameter estimates. The mixed model procedure allows us to consider temporal correlation functions involving a limited number of parameters. The mixed model procedure falls beyond the scope of this class, so the following brief outline is intended to be just an overview.

### Approach 3 - Mixed Model Analysis

The mixed model initially looks identical to the split-plot model considered earlier.

$$Y_{ijk} = \mu + \alpha_i + \beta_{j(i)}+ \tau_k + (\alpha\tau)_{ik} + \epsilon_{ijk}$$

where

• $$\mu$$ = overall mean
• $$\alpha_i$$ = effect of treatment i
• $$\beta_{j(i)}$$ = random effect of dog j receiving treatment i
• $$\tau_k$$ = effect of time k
• $$(\alpha\tau)_{ik}$$ = treatment by time interaction
• $$\epsilon_{ijk}$$ = experimental error

Assumptions

1. The dog effects $$\beta_{j(i)}$$ are independently sampled from a normal distribution with mean 0 and variance $$\sigma^2_\beta$$.
2. The errors $$\epsilon_{ijk}$$ from different dogs are independently sampled from a normal distribution with mean 0 and variance $$\sigma^2_\epsilon$$.
3. The correlation between the errors for the same dog depends only on the difference in observation times: $$|k-k'|$$.

Several covariance and correlation functions are listed below.

• Compound Symmetry: $$cov(\epsilon_{ijk}, \epsilon_{ijk'}) = \sigma^2_\epsilon+\sigma^2_\beta$$ if $$k=k'$$ and $$\sigma^2_\beta$$ otherwise. This is the default structure for split plots.
• Autoregressive: $$corr(\epsilon_{ijk}, \epsilon_{ijk'}) = \rho^{|k-k'|}$$
• Autoregressive Moving Average:

$$corr(\epsilon_{ijk}, \epsilon_{ijk'}) = \left\{\begin{array}{cl}\gamma; & \text{if } |k-k'| = 1 \\ \gamma\rho^{|k-k'|-1}; & \text{if } |k-k'| \ge 2\end{array}\right.$$

• Toeplitz: $$corr(\epsilon_{ijk}, \epsilon_{ijk'}) = \rho(|k-k'|)$$

Note!
• The autoregressive model is a special case of an autoregressive moving average model with $$\gamma = \rho$$.
• The autoregressive moving average model is a special case of a Toeplitz model with

$$\rho(|k-k'|) = \left\{\begin{array}{cl}\gamma; & \text{if } |k-k'| = 1 \\ \gamma\rho^{|k-k'|-1}; & \text{if } |k-k'| \ge 2\end{array}\right.$$
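The correlation structures listed above can be written out as matrices to make the nesting concrete. The following Python sketch is only an illustration: the function names and parameter values are arbitrary (they are not estimates from the dog data), and the time points are treated as equally spaced lags 1, 2, 3. Under the parameterization shown above, the ARMA(1,1) matrix coincides with the AR(1) matrix when the lag-1 correlation γ equals ρ.

```python
# Correlation matrices for t equally spaced time points.
# Parameter values are arbitrary illustrations, not estimates from the dog data.

def corr_matrix(t, rho_of_lag):
    """Build a t x t correlation matrix from a function of the lag |k - k'|."""
    return [[1.0 if k == m else rho_of_lag(abs(k - m)) for m in range(t)]
            for k in range(t)]

def compound_symmetry(t, rho):
    """Same correlation at every lag."""
    return corr_matrix(t, lambda lag: rho)

def ar1(t, rho):
    """Correlation rho**lag."""
    return corr_matrix(t, lambda lag: rho ** lag)

def arma11(t, gamma, rho):
    """Correlation gamma at lag 1 and gamma * rho**(lag - 1) beyond."""
    return corr_matrix(t, lambda lag: gamma * rho ** (lag - 1))

def toeplitz(t, rhos):
    """rhos[lag - 1] is the correlation at lag 1, ..., t - 1."""
    return corr_matrix(t, lambda lag: rhos[lag - 1])

t = 4
for row in ar1(t, 0.5):
    print(row)

# ARMA(1,1) with gamma equal to rho reproduces AR(1) ...
assert arma11(t, 0.5, 0.5) == ar1(t, 0.5)
# ... and a Toeplitz matrix with rho(lag) = gamma * rho**(lag - 1)
# reproduces ARMA(1,1).
toep_rhos = [0.6 * 0.5 ** (lag - 1) for lag in range(1, t)]
assert toeplitz(t, toep_rhos) == arma11(t, 0.6, 0.5)
```

Counting free correlation parameters for t = 4 gives one for compound symmetry and AR(1), two for ARMA(1,1), and three for Toeplitz, which is why the more general structures cost more to estimate.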

### Analysis

Approach 1: If one model is a special case of another, they can be compared using the -2 log likelihood values output. The difference is approximately chi-squared with degrees of freedom equal to the difference between the numbers of estimated parameters. For example, when comparing the AR(1) model with the ARMA(1,1) model, the difference between their -2 log likelihood values is:

$$237.426 - 237.329 = 0.097$$

which is less than the chi-square critical value of $$3.84 = \chi^2_{1, 0.05}$$ (the d.f. is 1 because there is one additional parameter estimated with the ARMA(1,1) model). This is not significant evidence that the ARMA(1,1) model fits better, so the AR(1) model would be preferred.

Approach 2: Models that are not special cases of each other can be compared using AICC or BIC values from the output. Smaller values are better. For example, based on the AICC values for the CS, AR(1), ARMA(1,1), and Toeplitz models below, the AR(1) would be preferred.

| Covariance structure | AICC |
|---|---|
| Compound Symmetry | 243.9 |
| AR(1) | 243.6 |
| ARMA(1,1) | 245.7 |
| Toeplitz | 247.8 |
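Both comparisons can be sketched in a few lines of Python using only the values quoted above. This is an illustration, not part of the SAS output; it relies on the fact that, for 1 degree of freedom, the chi-square upper tail probability has the closed form erfc(sqrt(x / 2)), so the shortcut would not apply to comparisons with more than one extra parameter.

```python
import math

# Approach 1: likelihood ratio comparison of nested models (values from the text).
neg2ll_ar1 = 237.426      # -2 log likelihood, AR(1)
neg2ll_arma11 = 237.329   # -2 log likelihood, ARMA(1,1)
lr_stat = neg2ll_ar1 - neg2ll_arma11

# One extra parameter, so compare to chi-square with 1 d.f.
# For 1 d.f. only: P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2.0))
prefer_larger_model = p_value < 0.05
print(f"LR statistic = {lr_stat:.3f}, p = {p_value:.3f}, "
      f"prefer ARMA(1,1): {prefer_larger_model}")

# Approach 2: pick the structure with the smallest AICC (values from the text).
aicc = {"Compound Symmetry": 243.9, "AR(1)": 243.6,
        "ARMA(1,1)": 245.7, "Toeplitz": 247.8}
best = min(aicc, key=aicc.get)
print("Smallest AICC:", best)
```

Both routes point to the AR(1) structure for these data.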

#### Using SAS

The syntax for using SAS's Mixed Model procedure can be seen in the program below.

Note! The first instance of the mixed procedure (without the repeated statement) is using the compound symmetry structure.

```sas
options ls=78;
title "Mixed Model Analysis - Dog Data";

/* Read one record per dog, then write one record per dog per time point */
data dogs;
  infile "D:\Statistics\STAT 505\data\dog1.txt";
  input treat dog p1 p2 p3 p4;
  time=1;  k=p1; output;
  time=5;  k=p2; output;
  time=9;  k=p3; output;
  time=13; k=p4; output;
  drop p1 p2 p3 p4;
run;

/* Model 1: random dog effect only (compound symmetry) */
proc mixed;
  class treat dog time;
  model k=treat|time;
  random dog(treat);
  lsmeans treat|time;
run;

/* Model 2: autoregressive errors within each dog */
proc mixed;
  class treat dog time;
  model k=treat|time;
  random dog(treat);
  repeated / subject=dog(treat) type=ar(1);
run;

/* Model 3: autoregressive moving average errors within each dog */
proc mixed;
  class treat dog time;
  model k=treat|time;
  random dog(treat);
  repeated / subject=dog(treat) type=arma(1,1);
run;

/* Model 4: Toeplitz errors within each dog */
proc mixed;
  class treat dog time;
  model k=treat|time;
  random dog(treat);
  repeated / subject=dog(treat) type=toep;
run;
```

The general format here for the mixed model procedure requires that the data are on separate lines for separate points in time.
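The reshaping done by the SAS data step, from one line per dog to one line per dog per time point, can be sketched in Python as follows. The function name `wide_to_long` is hypothetical and the two input rows are made-up placeholders, not values from dog1.txt; only the variable names and the observation times 1, 5, 9, and 13 come from the program above.

```python
# Reshape "wide" records (one row per dog) into "long" records
# (one row per dog per time point), mirroring the SAS data step above.
# The two wide rows are made-up placeholders, NOT values from dog1.txt.

TIMES = [1, 5, 9, 13]   # minutes after occlusion, as in the SAS data step

def wide_to_long(rows):
    """rows: iterable of (treat, dog, p1, p2, p3, p4) tuples."""
    long_rows = []
    for treat, dog, *readings in rows:
        for time, k in zip(TIMES, readings):
            long_rows.append({"treat": treat, "dog": dog, "time": time, "k": k})
    return long_rows

wide = [
    (1, 1, 4.0, 4.1, 4.3, 4.2),
    (2, 1, 3.5, 3.6, 3.9, 3.8),
]
long_rows = wide_to_long(wide)
for row in long_rows[:4]:
    print(row)
```

Each wide record yields four long records that share the same treat and dog values, which is the layout the repeated statement needs in order to group measurements by subject.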

Every model after the first includes a repeated statement. In the second, third, and fourth models, the subject option is set to dog(treat), the dog within treatment, which identifies the units within which we have repeated measures, in this case each of the dogs.

The type option then specifies the correlation structure. Here we use ar(1) for an autoregressive model, arma(1,1) for a (1,1) autoregressive moving average model, and toep, short for Toeplitz.

Based on the AR(1) model above, the following hypothesis tests are obtained:

| Effect | F | d.f. | p-value |
|---|---|---|---|
| Treatments | 6.09 | 3, 32 | 0.0021 |
| Time | 9.84 | 3, 96 | < 0.0001 |
| Treatment by Time | 1.89 | 9, 96 | 0.0631 |

# 9.8 - Summary

In this lesson we learned about:
• The split-plot ANOVA for testing interactions between treatment and time, and the main effects of treatment and time;
• The use of MANOVA to test interactions between treatment and time, and the main effects of treatment and time;
• The shortcomings of the split-plot ANOVA and MANOVA procedures for analyzing repeated measures data.
