Lesson 9: Repeated Measures Analysis
Overview
Repeated measures data come from experiments in which each experimental unit is observed at multiple points in time. Instead of a single observation per unit, we have a single response variable measured repeatedly over time.
Responses measured on the same unit are generally temporally correlated: observations collected close together in time are more likely to be similar to one another than observations collected far apart. One way to handle this is to treat observations collected at different points in time as if they were different variables - this is the multivariate analysis approach. You will see that two distinctly different approaches are frequently considered in this analysis, one of which involves a univariate analysis.
We will use the following experiment to illustrate the statistical procedures associated with repeated measures data.
Example 9-1: Dog Experiment
In this experiment we had a completely randomized block experimental design that was carried out to determine the effects of 4 surgical treatments on coronary potassium in a group of 36 dogs. There are 9, 8, 9, and 10 dogs in each group, respectively. Each dog was measured at four different points in time following one of four experimental treatments:
 Control - no surgical treatment is applied.
 Extrinsic cardiac denervation immediately prior to treatment.
 Bilateral thoracic sympathectomy and stellectomy 3 weeks prior to treatment.
 Extrinsic cardiac denervation 3 weeks prior to treatment.
Coronary sinus potassium levels were measured at 1, 5, 9, and 13 minutes following a procedure called an occlusion. We are looking at the effect of the occlusion on the coronary sinus potassium levels following different surgical treatments.
Approaches
There are a number of approaches to analyzing this type of data. Two of them are common historical approaches, the first of which was proposed before the advent of modern computing so that it could be carried out using hand calculations. They are followed by a more modern approach:
 Split-plot ANOVA - this is perhaps the most common approach.
 MANOVA - this is what we will focus on in this lesson.
 Mixed Models - the more modern approach.
Notation in this Lesson
 \(Y_{ijk}\)= Potassium level for treatment i in dog j at time k
 a = Number of treatments
 \(n_{i}\) = Number of replicates of treatment i
 \(N = n _ { 1 } + n _ { 2 } + \ldots + n _ { a }\) = Total number of experimental units
 t = Number of observations over time
Objectives
 Use a split-plot ANOVA to test for interactions between treatments and time, and for the main effects of treatments and time;
 Use a MANOVA to test for interactions between treatments and time, and for the main effects of treatments;
 Understand why the split-plot ANOVA may give incorrect results; and
 Understand the shortcomings of the application of MANOVA to repeated measures data.
9.1 - Approach 1: Split-plot ANOVA
The split-plot ANOVA is perhaps the most traditional approach, for which hand calculations are not too unreasonable. It involves modeling the data using the linear model shown below:
Model: \(Y_{ijk} = \mu + \alpha_i + \beta_{j(i)}+ \tau_k + (\alpha\tau)_{ik} + \epsilon_{ijk}\)
Using this linear model, we assume that the observation for treatment i, dog j, at time k is equal to an overall mean \(\mu\) plus the treatment effect \(\alpha_i\), the effect of the dog within that treatment \(\beta_{j(i)}\), the effect of time \(\tau_k\), the effect of the interaction between time and treatment \((\alpha\tau)_{ik}\), and the error \(\varepsilon_{ijk}\).
Such that:
 \(\mu\) = overall mean
 \(\alpha_i\) = effect of treatment i
 \(\beta_{j \left( i \right)}\) = random effect of dog j receiving treatment i
 \(\tau_{k}\)= effect of time k
 \(\left( \alpha \tau \right)_{ik}\) = treatment by time interaction
 \(\varepsilon_{ijk}\) = experimental error
Assumptions:
We are going to make the following assumptions about the data:
1. The errors \(\varepsilon_{ijk}\) are independently sampled from a normal distribution with mean 0 and variance \(\sigma^2_{\epsilon}\).
2. The individual dog effects \(\beta_{j(i)}\) are also independently sampled from a normal distribution with mean 0 and variance \(\sigma^2_{\beta}\).
3. The effect of time does not depend on the dog; that is, there is no time by dog interaction. We generally need this assumption because otherwise the results would depend on which animal we were looking at, which would mean that we could not predict much for new animals.
Because the model combines a random effect (dog) with fixed effects (treatment and time), it is called a mixed effects model.
The analysis is carried out in this Analysis of Variance Table shown below:
| Source | d.f. | SS | MS | F |
| --- | --- | --- | --- | --- |
| Treatment | \(a - 1\) | \(SS_{\text{treat}}\) | \(\dfrac{SS_{\text{treat}}}{a - 1}\) | \(\dfrac{MS_{\text{treat}}}{MS_{\text{error}(a)}}\) |
| Error (a) | \(N - a\) | \(SS_{\text{error}(a)}\) | \(\dfrac{SS_{\text{error}(a)}}{N - a}\) | |
| Time | \(t - 1\) | \(SS_{\text{time}}\) | \(\dfrac{SS_{\text{time}}}{t - 1}\) | \(\dfrac{MS_{\text{time}}}{MS_{\text{error}(b)}}\) |
| Treat x Time | \((a - 1)(t - 1)\) | \(SS_{\text{treat x time}}\) | \(\dfrac{SS_{\text{treat x time}}}{(a - 1)(t - 1)}\) | \(\dfrac{MS_{\text{treat x time}}}{MS_{\text{error}(b)}}\) |
| Error (b) | \((N - a)(t - 1)\) | \(SS_{\text{error}(b)}\) | \(\dfrac{SS_{\text{error}(b)}}{(N - a)(t - 1)}\) | |
| Total | \(Nt - 1\) | \(SS_{\text{total}}\) | | |
where,
a: the number of treatments
N: the total number of all experimental units
t: number of time points
The sources of the variation include treatment; Error (a); the effect of Time; the interaction between time and treatment; and Error (b). Error (a) is the effect of subjects within treatments and Error (b) is the individual error in the model. All these add up to the total.
 Sum of Squares Formulas

Here are the formulas that are used to calculate the various Sums of Squares involved:
\(\begin{array}{lll}SS_{total} & = & \sum_{i=1}^{a}\sum_{j=1}^{n_i}\sum_{k=1}^{t}Y^2_{ijk} - Nt\bar{y}^2_{...}\\ SS_{treat} & = & t\sum_{i=1}^{a}n_i\bar{y}^2_{i..} - Nt\bar{y}^2_{...}\\ SS_{error(a)} & = & t\sum_{i=1}^{a}\sum_{j=1}^{n_i}\bar{y}^2_{ij.} - t\sum_{i=1}^{a}n_i\bar{y}^2_{i..}\\ SS_{time} & = & N\sum_{k=1}^{t}\bar{y}^2_{..k} - Nt\bar{y}^2_{...}\\ SS_{\text{treat x time}} & = & \sum_{i=1}^{a}\sum_{k=1}^{t}n_i\bar{y}^2_{i.k} - Nt\bar{y}^2_{...} - SS_{treat} - SS_{time}\end{array}\)
Each Mean Square (MS) is obtained by dividing the corresponding Sum of Squares by its degrees of freedom.
To test the main effect of treatment, we compare the MS for treatment to the MS for Error (a).
We will compare these results with the results we get from the MANOVA, the next approach covered in this lesson.
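To make the sums-of-squares formulas concrete, here is a minimal Python sketch that evaluates them on a small made-up data set. The function name `split_plot_anova`, the dictionary keys, and the data values are all illustrative assumptions; none of them come from the lesson's SAS programs. Error (b) is obtained by subtraction, relying on the fact that the sources add up to the total.

```python
def split_plot_anova(y, treat):
    """Split-plot ANOVA sums of squares, d.f., mean squares and F-ratios.

    y     : list of N rows (one per subject), each with t repeated measurements
    treat : list of N treatment labels, one per subject
    """
    N, t = len(y), len(y[0])
    levels = sorted(set(treat))
    a = len(levels)

    grand_mean = sum(sum(row) for row in y) / (N * t)
    correction = N * t * grand_mean ** 2  # Nt * ybar_...^2

    # Group the subjects by treatment and record the group sizes n_i
    groups = {g: [row for row, lab in zip(y, treat) if lab == g] for g in levels}
    n = {g: len(rows) for g, rows in groups.items()}

    # Means that appear in the formulas
    ybar_i = {g: sum(sum(r) for r in rows) / (n[g] * t) for g, rows in groups.items()}
    ybar_ij = [sum(row) / t for row in y]
    ybar_k = [sum(row[k] for row in y) / N for k in range(t)]
    ybar_ik = {g: [sum(r[k] for r in rows) / n[g] for k in range(t)]
               for g, rows in groups.items()}

    treat_term = t * sum(n[g] * ybar_i[g] ** 2 for g in levels)
    cell_term = sum(n[g] * m ** 2 for g in levels for m in ybar_ik[g])

    ss = {}
    ss["total"] = sum(v ** 2 for row in y for v in row) - correction
    ss["treat"] = treat_term - correction
    ss["error_a"] = t * sum(m ** 2 for m in ybar_ij) - treat_term
    ss["time"] = N * sum(m ** 2 for m in ybar_k) - correction
    ss["treat_x_time"] = cell_term - correction - ss["treat"] - ss["time"]
    # Error (b) by subtraction: all sources add up to the total
    ss["error_b"] = (ss["total"] - ss["treat"] - ss["error_a"]
                     - ss["time"] - ss["treat_x_time"])

    df = {"treat": a - 1, "error_a": N - a, "time": t - 1,
          "treat_x_time": (a - 1) * (t - 1), "error_b": (N - a) * (t - 1)}
    ms = {src: ss[src] / df[src] for src in df}
    f = {"treat": ms["treat"] / ms["error_a"],
         "time": ms["time"] / ms["error_b"],
         "treat_x_time": ms["treat_x_time"] / ms["error_b"]}
    return ss, df, ms, f


# Made-up data: 2 treatments with 3 and 2 subjects, 3 time points each
y = [[4.0, 4.5, 5.1], [3.8, 4.1, 4.4], [4.2, 4.9, 5.0],
     [3.1, 3.0, 3.6], [3.3, 3.5, 3.4]]
treat = ["A", "A", "A", "B", "B"]

ss, df, ms, f = split_plot_anova(y, treat)
for source in df:
    print(source, df[source], round(ss[source], 4), round(ms[source], 4))
print({src: round(val, 2) for src, val in f.items()})
```

Note that treatment is tested against Error (a), while time and the interaction are tested against Error (b), exactly as in the ANOVA table.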
9.2 - Example
Example 9-2
Download the text file containing the data: dog1.txt
Using SAS
We will use the following SAS program below to illustrate this procedure.
Download the SAS Program here: dog2.sas
View the video explanation of the SAS code.
Using Minitab
Currently not available in Minitab
Analysis
Run the SAS program inspecting how the program applies this procedure.
Note in the output where values of interest are located. The results are copied from the SAS output into this table here:
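| Source | d.f. | SS | MS | F |
| --- | --- | --- | --- | --- |
| Treatment | 3 | 19.923 | 6.641 | 6.00 |
| Error (a) | 32 | 35.397 | 1.106 | |
| Time | 3 | 6.204 | 2.068 | 11.15 |
| Interaction | 9 | 3.440 | 0.382 | 2.06 |
| Error (b) | 96 | 17.800 | 0.185 | |
| Total | 143 | 82.320 | | |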
Hypotheses Tests
Now that we have the results from the analysis, the first thing that we want to look at is the interaction between treatment and time. We want to determine here if the effect of treatment depends on time. Therefore, we will start with:
 The interaction between treatment and time, or:
\(H_0\colon (\alpha\tau)_{ik} = 0 \) for all \( i = 1,2, \dots, a;\) \(k = 1,2, \dots, t\)
Here we need to look at the treatment by time interaction term, whose F-value is reported as 2.06. We compare this to an F-distribution with (a - 1)(t - 1) = 9 and (N - a)(t - 1) = 96 degrees of freedom. The numerator d.f. of 9 is tied to the source of variation due to the interaction, while the denominator d.f. of 96 is tied to the source of variation due to Error (b).
We can reject \(H_0\) at level \(\alpha\) if
\(F = \dfrac{MS_{\text{treat x time}}}{MS_{error(b)}} > F_{(a-1)(t-1), (N-a)(t-1), \alpha}\)
Here we see that the test is significant, with a p-value of 0.0406.
Result: We can conclude that the effect of treatment depends on time (F = 2.06; d. f. = 9, 96; p = 0.0406).
Next Steps...
 Because the interaction between treatment and time is significant, the next step in the analysis would be to further explore the nature of that interaction using something called profile plots, (we will look at this later...).
 If the interaction between treatment and time was not significant, the next step in the analysis would be to test for the main effects of treatment and time.
 Let's suppose that we had not found a significant interaction. Let's do this so that you can see what it would look like to consider the effects of treatment.
Consider testing the null hypothesis that there are no treatment effects, or
\(H_0\colon \alpha_1 = \alpha_2 = \dots = \alpha_a = 0\)
To test this null hypothesis, we compute the F-ratio between the Mean Square for Treatment and the Mean Square for Error (a). We then reject \(H_0\) at level \(\alpha\) if
\(F = \dfrac{MS_{treat}}{MS_{error(a)}} > F_{a-1, N-a, \alpha}\)
Here, the numerator degrees of freedom is a - 1 = 3 for treatment, while the denominator degrees of freedom is N - a = 32 for Error (a).
Result: We can conclude that the treatment significantly affects the mean coronary sinus potassium over the t = 4 sampling times (F = 6.00; d. f. = 3, 32; p = 0.0023).
 Consider testing the effects of time:
\(H_0\colon \tau_1 = \tau_2 = \dots = \tau_t = 0\)
To test this null hypothesis, we compute the F-ratio between the Mean Square for Time and the Mean Square for Error (b). We then reject \(H_0\) at level \(\alpha\) if
\(F = \dfrac{MS_{time}}{MS_{error(b)}} > F_{t-1, (N-a)(t-1), \alpha}\)
Here, the numerator degrees of freedom is t - 1 = 3 for time, while the denominator degrees of freedom is (N - a)(t - 1) = 96 for Error (b).
Result: We can conclude that coronary sinus potassium varies significantly over time (F = 11.15; d. f. = 3, 96; p < 0.0001).
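As a quick arithmetic check, the mean squares and F-ratios reported above can be recomputed from the degrees of freedom and sums of squares in the ANOVA table. The short Python sketch below does only that; the dictionary keys are illustrative names, and no p-values are computed.

```python
# (d.f., SS) for each source, copied from the ANOVA table in this example
table = {
    "treat": (3, 19.923),
    "error_a": (32, 35.397),
    "time": (3, 6.204),
    "treat_x_time": (9, 3.440),
    "error_b": (96, 17.800),
}

# Each mean square is the sum of squares divided by its degrees of freedom
ms = {source: ss / df for source, (df, ss) in table.items()}

# Treatment is tested against Error (a); time and interaction against Error (b)
f_treat = ms["treat"] / ms["error_a"]
f_time = ms["time"] / ms["error_b"]
f_inter = ms["treat_x_time"] / ms["error_b"]

print(f"{f_treat:.2f} {f_time:.2f} {f_inter:.2f}")  # 6.00 11.15 2.06
```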
9.3 - Some Criticisms about the Split-plot ANOVA Approach
This approach and these results assume a constant correlation between any two observations from the same dog. This assumption is unlikely to hold because, typically, when you have repeated measurements over time, the data from the same subject at two different points in time are temporally correlated. In principle, observations that are collected at times that are close together are going to be more similar to one another than observations that are far apart.
This motivates an alternative approach, which is to treat this situation as a Multivariate Analysis of Variance problem instead of an Analysis of Variance problem.
9.4 - Approach 2: MANOVA
When taking a multivariate approach, we collect the observations over time from the same dog, dog j receiving treatment i, into a vector:
\(\mathbf{Y}_{ij} = \left(\begin{array}{c}Y_{ij1}\\ Y_{ij2} \\ \vdots\\ Y_{ijt}\end{array}\right)\)
We treat the data collected at different points in time as if it were data from different variables. Basically, we have a vector of observations for dog j receiving treatment i and each entry corresponds to data collected at a particular point in time.
The usual assumptions are made for a one-way MANOVA. In this case:
 Dogs receiving treatment i have common mean vector \(\boldsymbol{\mu}_{i}\)
 All dogs have common variance-covariance matrix \(\Sigma\)
 Data from different dogs are independently sampled
 Data are multivariate normally distributed
The Analysis
Step 1: Use a MANOVA to test for overall differences among the treatments in the mean vectors of the observations at the four time points.
Using SAS
We will use the Dog SAS program to perform this multivariate analysis.
Download the SAS program: dog.sas
We use the glm procedure to analyze these data. In this case, we look at a one-way MANOVA, because we have only one classification variable: treatment.
The model statement includes the variables of interest on the left-hand side of the equal sign. In this case, they are p1, p2, p3, and p4 (the potassium levels at the four points in time). We put the explanatory variable, treatment, on the right-hand side of the equal sign.
The first manova statement tests the hypothesis that the mean vector of observations over time does not depend on treatment. The print option asks for the error sums of squares and cross products matrix as well as the partial correlations.
The second manova statement tests for the main effects of treatment. We'll return to this later.
The third manova statement tests for the interaction between treatment and time. We'll also return to this later.
Right now, the result that we want to focus on is the Wilks' Lambda of 0.484 and the corresponding F-approximation of 2.02 with 12 and 77 d.f. A p-value of 0.0332 indicates that we can reject the null hypothesis that there is no treatment effect.
Next Steps...
If we find that there is a significant difference, then with repeated measures data we tend to focus on a couple of additional questions:
 First Question
Is there a significant treatment by time interaction? Or, in other words, does the effect of treatment depend on the observation time? Previously, in the split-plot ANOVA, this question was evaluated by looking at the F-value of 2.06, which was reported as a significant result. If we find that the interaction is significant, the next thing we need to address is the nature of that interaction.
 Alternative Question
If we do not find a significant interaction, then we can collapse the data and determine if the average sinus potassium level over time differs significantly among treatments. Here, we are looking at the main effects of treatment.
Let's proceed...
9.5 - Step 2: Test for treatment by time interactions
Using SAS
To test for treatment by time interactions we need to carry out a Profile Analysis. We can create a Profile Plot as shown in the Dog SAS program. (This program is similar in structure to swiss13a.sas, used previously in the Hotelling's T-square lesson.)
Here, we want to plot the treatment means against time for each of our four treatments. We can then examine the form the interactions take if they are deemed significant.
Download the SAS Program: dog1.sas
This program plots the treatment means against time, separately for each treatment. Here, the means for treatment 1 are given by the circles, treatment 2 squares, treatment 3 triangles and treatment 4 stars.
The test for interaction tests the hypothesis that these line segments are parallel to one another.
To test for interaction, we define a new data vector for each observation. Here we consider the data vector for dog j receiving treatment i. This data vector is obtained by subtracting the data from time 2 minus the data from time 1, the data from time 3 minus the data from time 2, and so on...
This yields the vector of differences between successive times and is expressed as follows:
\(\mathbf{Z}_{ij} = \left(\begin{array}{c}Z_{ij1}\\ Z_{ij2} \\ \vdots \\ Z_{ij,t-1}\end{array}\right) = \left(\begin{array}{c}Y_{ij2}-Y_{ij1}\\ Y_{ij3}-Y_{ij2} \\ \vdots \\Y_{ijt}-Y_{ij,t-1}\end{array}\right)\)
Because this vector is a function of the random data, it is a random vector, and so has a population mean. Thus, for treatment i, we define the population mean vector \(E(\mathbf{Z}_{ij}) = \boldsymbol{\mu}_{Z_i}\).
Then we will perform a MANOVA on these \(Z_{ij}\)'s to test the null hypothesis that
\(H_0\colon \boldsymbol{\mu}_{Z_1} = \boldsymbol{\mu}_{Z_2} = \dots = \boldsymbol{\mu}_{Z_a} \)
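The construction of the difference vectors can be sketched in a few lines of Python. The helper names and the readings below are made up for illustration. The second group is deliberately built so that its profiles are shifted copies of the first group's, so the two groups have parallel mean profiles and, as the null hypothesis describes, equal mean difference vectors.

```python
def successive_differences(y_row):
    """Map (Y_1, ..., Y_t) to (Y_2 - Y_1, ..., Y_t - Y_{t-1})."""
    return [later - earlier for earlier, later in zip(y_row, y_row[1:])]


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


# Made-up readings at four time points for two hypothetical treatment groups.
# Each row of group_2 is a row of group_1 shifted down by a constant.
group_1 = [[4.0, 4.5, 5.1, 4.8], [3.8, 4.1, 4.4, 4.3]]
group_2 = [[3.1, 3.6, 4.2, 3.9], [3.3, 3.6, 3.9, 3.8]]

z_1 = [successive_differences(row) for row in group_1]
z_2 = [successive_differences(row) for row in group_2]

# Each difference vector has t - 1 = 3 entries
print(mean_vector(z_1))
print(mean_vector(z_2))
```

The two printed mean vectors agree up to floating-point rounding, even though the groups differ in overall level. This is why a test on the differences addresses the interaction (parallelism) and not the main effect of treatment.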
Using SAS
The SAS program carries out this MANOVA procedure in the third MANOVA statement as highlighted below:
Download the SAS program: dog.sas
In the third manova statement, we are testing for interaction between treatment and time. We obtain the vector Z by setting m equal to the differences between the data at successive times, i.e., p2-p1, p3-p2, and p4-p3. This carries out the profile analysis or, equivalently, tests for interactions between treatment and time.
Let's look at the output. Again, be careful when you look at the results to make sure you are in the right part of the output.
| | p1 | p2 | p3 | p4 |
| --- | --- | --- | --- | --- |
| MVAR1 | -1 | 1 | 0 | 0 |
| MVAR2 | 0 | -1 | 1 | 0 |
| MVAR3 | 0 | 0 | -1 | 1 |
Find the table that shows the function used to define the vector MVAR, comprised of the elements MVAR1, MVAR2, and MVAR3.
For MVAR1 we have minus p1 plus p2, for MVAR2 we have minus p2 plus p3, and so on...
The results are then found below this table in the SAS output:
| Statistic | Value | F Value | Num DF | Den DF | Pr > F |
| --- | --- | --- | --- | --- | --- |
| Wilks' Lambda | 0.59835958 | 1.91 | 9 | 73.163 | 0.0637 |
| Pillai's Trace | 0.44352640 | 1.85 | 9 | 96 | 0.0689 |
| Hotelling-Lawley Trace | 0.60246548 | 1.96 | 9 | 44.068 | 0.0672 |
| Roy's Greatest Root | 0.46206108 | 4.93 | 3 | 32 | 0.0063 |
NOTE: F Statistic for Roy's Greatest Root is an upper bound.
Here we get a Wilks' Lambda of 0.598 with a supporting F-value of 1.91 with 9 and 73 d.f.
The p-value of 0.0637 is not significant if we strictly adhere to the 0.05 significance level.
Conclusion
There is weak evidence that the effect of treatment depends on time \( \left( \Lambda = 0.598; F = 1.91; d. f. = 9, 73; p = 0.0637 \right) \).
By reporting the p-value with our results, we allow readers to make their own judgment regarding the significance of the test. Conservative readers might say that 0.0637 is not significant at the 0.05 level and infer that there is no evidence for an interaction. More liberal readers might say that it is very close and consider it weak evidence for an interaction.
9.6 - Step 3: Test for the main effects of treatments
Because the treatment by time interaction is not significant at the 0.05 level, the next step is to test for the main effects of treatment.
We now define a new variable equal to the sum of the observations for each animal. To test for the main treatment effect, consider the following linear combination of the observations for each dog; that is, the sum of all the data points collected for animal j receiving treatment i.
\(Z_{ij} = Y_{ij1}+Y_{ij2}+\dots + Y_{ijt}\)
This is going to be a random variable and a scalar quantity. We could then define the mean as:
\(E(Z_{ij}) = \mu_{Z_i} \)
Consider testing the following hypothesis that all of these means are equal to one another against the alternative that at least two of them are different, or:
\(H_0\colon \mu_{Z_1} = \mu_{Z_2} = \dots = \mu_{Z_a} \)
Using SAS
An ANOVA on the data \(Z_{ij}\) is carried out using the second MANOVA statement in the SAS program:
Download the SAS Program: dog.sas
h=treat sets the hypothesis test about treatments.
Then we set m = p1+p2+p3+p4 to define the random variable Z as above.
Now, we must make sure that we are looking at the correct part of the output! In this case we have defined a single new variable, MVAR, which is the sum of the four potassium measurements.
Results for Wilks' Lambda:
| Statistic | Value | F Value | Num DF | Den DF | Pr > F |
| --- | --- | --- | --- | --- | --- |
| Wilks' Lambda | 0.63985247 | 6.00 | 3 | 32 | 0.0023 |
| Pillai's Trace | 0.36014753 | 6.00 | 3 | 32 | 0.0023 |
| Hotelling-Lawley Trace | 0.56286025 | 6.00 | 3 | 32 | 0.0023 |
| Roy's Greatest Root | 0.56286025 | 6.00 | 3 | 32 | 0.0023 |
This indicates that there is a significant main effect of treatment. That is, the mean response across the four time points differs significantly among treatments.
Conclusion
Treatments have a significant effect on the average coronary sinus potassium over the first four time points following occlusion \( \left( \Lambda = 0.640; F = 6.00; d. f. = 3, 32; p = 0.0023 \right) \).
Comparing this result with the results obtained from the split-plot ANOVA, we find that they are identical: the F-value, p-value, and degrees of freedom all agree. This is not an accident! It is a mathematical identity.
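The identity can be checked numerically. The sketch below, with made-up data and illustrative function names, computes the one-way ANOVA F-ratio on the per-subject sums \(Z_{ij}\) and the split-plot ratio of the treatment mean square to the Error (a) mean square. It then shows that the two agree. The reason is that both the between-treatment and within-treatment sums of squares of the sums are exactly t times their split-plot counterparts, so the factor cancels in the ratio.

```python
def one_way_f(values, labels):
    """One-way ANOVA F-ratio for scalar values grouped by labels."""
    levels = sorted(set(labels))
    N, a = len(values), len(levels)
    grand = sum(values) / N
    groups = {g: [v for v, lab in zip(values, labels) if lab == g] for g in levels}
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    ss_between = sum(len(vs) * (means[g] - grand) ** 2 for g, vs in groups.items())
    ss_within = sum((v - means[g]) ** 2 for g, vs in groups.items() for v in vs)
    return (ss_between / (a - 1)) / (ss_within / (N - a))


def split_plot_treatment_f(y, labels):
    """MS_treat / MS_error(a), using the split-plot sums-of-squares formulas."""
    N, t = len(y), len(y[0])
    levels = sorted(set(labels))
    a = len(levels)
    grand = sum(sum(row) for row in y) / (N * t)
    n = {g: labels.count(g) for g in levels}
    ybar_i = {g: sum(sum(row) for row, lab in zip(y, labels) if lab == g) / (n[g] * t)
              for g in levels}
    treat_term = t * sum(n[g] * ybar_i[g] ** 2 for g in levels)
    ss_treat = treat_term - N * t * grand ** 2
    ss_error_a = t * sum((sum(row) / t) ** 2 for row in y) - treat_term
    return (ss_treat / (a - 1)) / (ss_error_a / (N - a))


# Made-up data: 2 treatments with 3 and 2 subjects, 3 time points each
y = [[4.0, 4.5, 5.1], [3.8, 4.1, 4.4], [4.2, 4.9, 5.0],
     [3.1, 3.0, 3.6], [3.3, 3.5, 3.4]]
labels = ["A", "A", "A", "B", "B"]

z = [sum(row) for row in y]  # Z_ij = Y_ij1 + ... + Y_ijt
f_sum = one_way_f(z, labels)
f_split = split_plot_treatment_f(y, labels)
print(f_sum, f_split)
```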
9.7 - Approach 3: Mixed Model Analysis
The problem with the multivariate procedure outlined above is that it makes no assumptions regarding the temporal correlation structure of the data and hence may be overparameterized, leading to poor parameter estimates. The mixed model procedure allows us to look at temporal correlation functions involving a limited number of parameters. The mixed model procedure falls beyond the scope of this class; the following brief outline is intended to be just an overview.
The mixed model initially looks identical to the split-plot model considered earlier.
\(Y_{ijk} = \mu + \alpha_i + \beta_{j(i)}+ \tau_k + (\alpha\tau)_{ik} + \epsilon_{ijk}\)
where
 \(\mu\) = overall mean
 \(\alpha_i\) = effect of treatment i
 \(\beta_{j(i)}\) = random effect of dog j receiving treatment i
 \(\tau_k\) = effect of time k
 \((\alpha\tau)_{ik}\) = treatment by time interaction
 \(\epsilon_{ijk}\) = experimental error
Assumptions
 The dog effects \(\beta_{j ( i )}\) are independently sampled from a normal distribution with mean 0 and variance \(\sigma^2_\beta\) .
 The errors \(\epsilon_{ijk}\) from different dogs are independently sampled from a normal distribution with mean 0 and variance \(\sigma^2_\epsilon\) .
 The correlation between the errors for the same dog depends only on the difference in observation times: \(|k-k'|\)
Several covariance and correlation functions are listed below.
 Compound Symmetry: \(cov(\epsilon_{ijk}, \epsilon_{ijk'}) = \sigma^2_\epsilon+\sigma^2_\beta\) if \(k=k'\) and \(\sigma^2_\beta\) otherwise. This is the default structure for split plots.
 Autoregressive: \(corr(\epsilon_{ijk}, \epsilon_{ijk'}) = \rho^{|k-k'|}\)
 Autoregressive Moving Average:
\(corr(\epsilon_{ijk}, \epsilon_{ijk'}) = \left\{\begin{array}{cl}\gamma; & \text{if } |k-k'| = 1 \\ \gamma\rho^{|k-k'|-1}; & \text{if } |k-k'| \ge 2\end{array}\right.\)
 Toeplitz: \(corr(\epsilon_{ijk}, \epsilon_{ijk'}) = \rho(|k-k'|)\)
 The autoregressive model is a special case of an autoregressive moving average model with \(\gamma = \rho\).
 The autoregressive moving average model is a special case of a Toeplitz model with
\(\rho(|k-k'|) = \left\{\begin{array}{cl}\gamma; & \text{if } |k-k'| = 1 \\ \gamma\rho^{|k-k'|-1}; & \text{if } |k-k'| \ge 2\end{array}\right.\)
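To see how these structures relate, the following Python sketch builds the within-dog correlation matrix implied by each one for t time points. The function names and parameter values are illustrative assumptions and are not taken from the SAS programs. Each matrix depends only on the lag, that is, the absolute difference between the two time indices. Under the formulas as written, the ARMA(1,1) matrix reduces to the AR(1) matrix when gamma equals rho, and a Toeplitz matrix with suitably chosen lag correlations reproduces the ARMA(1,1) matrix.

```python
def corr_matrix(t, rho_of_lag):
    """t x t correlation matrix whose (k, k') entry depends only on |k - k'|."""
    return [[1.0 if k == kp else rho_of_lag(abs(k - kp)) for kp in range(t)]
            for k in range(t)]


def compound_symmetry(t, rho):
    """Same correlation rho at every lag."""
    return corr_matrix(t, lambda lag: rho)


def ar1(t, rho):
    """Correlation rho ** lag."""
    return corr_matrix(t, lambda lag: rho ** lag)


def arma11(t, gamma, rho):
    """Correlation gamma at lag 1 and gamma * rho ** (lag - 1) at longer lags."""
    return corr_matrix(t, lambda lag: gamma * rho ** (lag - 1))


def toeplitz(t, rhos):
    """One free correlation per lag: rhos[lag - 1] is the value at that lag."""
    return corr_matrix(t, lambda lag: rhos[lag - 1])


t = 4
for row in ar1(t, 0.5):
    print(row)

# Nesting checks with illustrative parameter values
ar_as_arma = arma11(t, gamma=0.5, rho=0.5)
arma = arma11(t, gamma=0.6, rho=0.5)
arma_as_toeplitz = toeplitz(t, [0.6, 0.6 * 0.5, 0.6 * 0.5 ** 2])
```

The parameter counts also differ: for the correlation part, compound symmetry and AR(1) use one parameter, ARMA(1,1) uses two, and Toeplitz uses t - 1.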
Analysis
Approach 1: If one model is a special case of another, they can be compared using the -2 log likelihood values in the output. The difference is approximately chi-squared with degrees of freedom equal to the difference between the numbers of estimated parameters. For example, when comparing the AR(1) model with the ARMA(1,1) model, the difference between their -2 log likelihood values is:
\(237.426 - 237.329 = 0.097\)
which is less than the chi-square critical value of \(3.84 = \chi^2_{1, 0.05}\) (the d.f. is 1 because there is one additional parameter estimated with the ARMA(1,1) model). This is not significant evidence that the ARMA(1,1) model fits better, so the AR(1) model would be preferred.
Approach 2: Models that are not special cases of each other can be compared using AICC or BIC values from the output. Smaller values are better. For example, based on the AICC values for the CS, AR(1), ARMA(1,1), and Toeplitz models below, the AR(1) would be preferred.
| Model | AICC |
| --- | --- |
| Compound Symmetry | 243.9 |
| AR(1) | 243.6 |
| ARMA(1,1) | 245.7 |
| Toeplitz | 247.8 |
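Both comparisons can be reproduced with a few lines of Python using the values quoted above. This is only a sketch: the helper `chi2_sf_1df` is an illustrative name, and it relies on the fact that a chi-square variable with 1 d.f. is the square of a standard normal, so its upper-tail probability can be written with the complementary error function from the standard library.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))


# Nested models: compare the -2 log likelihood values reported in the output
neg2_log_lik = {"AR(1)": 237.426, "ARMA(1,1)": 237.329}
lrt = neg2_log_lik["AR(1)"] - neg2_log_lik["ARMA(1,1)"]
p_value = chi2_sf_1df(lrt)  # 1 d.f.: ARMA(1,1) has one extra parameter
print(f"difference = {lrt:.3f}, p = {p_value:.3f}")

# Non-nested models: smaller AICC is better
aicc = {"Compound Symmetry": 243.9, "AR(1)": 243.6,
        "ARMA(1,1)": 245.7, "Toeplitz": 247.8}
best = min(aicc, key=aicc.get)
print("preferred by AICC:", best)
```

A p-value far above 0.05 leads to the same decision as comparing the difference with the chi-square critical value.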
Using SAS
The syntax for using SAS's Mixed Model procedure can be seen in the program below.
Download the SAS Program here: dog3.sas
The general format here for the mixed model procedure requires that the data are on separate lines for separate points in time.
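As an illustration of that layout, the Python sketch below converts records with one row per dog (the variables treat and p1 to p4 used earlier) into one row per dog per time point. The field names `dog`, `time`, and `potassium`, the function name, and the data values are hypothetical; the actual variable names in dog3.sas may differ.

```python
def wide_to_long(records, time_points):
    """Turn one row per dog (p1..p4) into one row per dog per time point."""
    long_rows = []
    for rec in records:
        for k, minute in enumerate(time_points, start=1):
            long_rows.append({
                "treat": rec["treat"],
                "dog": rec["dog"],          # hypothetical subject identifier
                "time": minute,             # minutes after occlusion
                "potassium": rec[f"p{k}"],  # k-th repeated measurement
            })
    return long_rows


# Made-up wide-format records for two dogs
wide = [
    {"treat": 1, "dog": 1, "p1": 4.0, "p2": 4.0, "p3": 4.1, "p4": 3.6},
    {"treat": 2, "dog": 2, "p1": 3.4, "p2": 3.4, "p3": 3.5, "p4": 3.1},
]

long_rows = wide_to_long(wide, time_points=[1, 5, 9, 13])
for row in long_rows:
    print(row)
```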
Except for the first model, each of the various models will have repeated statements. The second, third and fourth models contain the repeated statements where subject is specified to be the dog within treatments, indicating within which units we have our repeated measures, in this case within each of the dogs.
This is followed by the type option, which specifies the model you want. Here we set ar(1) for an autoregressive model, arma(1,1) for a (1,1) autoregressive moving average model, and toep for a Toeplitz model.
Based on the AR(1) model above, the following hypothesis tests are obtained:
| Effect | F | d.f. | p-value |
| --- | --- | --- | --- |
| Treatments | 6.09 | 3, 32 | 0.0021 |
| Time | 9.84 | 3, 96 | < 0.0001 |
| Treatment by Time | 1.89 | 9, 96 | 0.0631 |
9.8 - Summary
In this lesson we learned about:
 The split-plot ANOVA for testing interactions between treatment and time, and the main effects of treatment and time;
 The use of MANOVA to test interactions between treatment and time, and the main effects of treatment and time;
 The shortcomings of the split-plot ANOVA and MANOVA procedures for analyzing repeated measures data.