12.2 - Modeling Longitudinal Data with GEE


Example - Schizophrenia

These data were analyzed by Hedeker and Gibbons (1997). They come from a randomized trial for schizophrenia in which:

  • 312 patients received drug therapy; 101 received placebo
  • measurements at weeks 0, 1, 3, 6, but some subjects have missing data due to dropout
  • outcome: severity of illness (1 = normal, ... , 7 = extremely ill)

"Spaghetti plot" of response curves for all subjects


Responses for drug patients:


Responses for placebo patients:


Average for each group at each time point:


The same plot versus \( \sqrt{\text{week}} \):


As shown by the second plot, the average trajectories for the placebo and drug groups appear to be approximately linear when plotted against the square root of week.

At baseline (week 0), the two groups have very similar averages. This makes sense. In a randomized trial, the groups are initially just a random division of the subjects; there should be no "treatment" effect because the treatment hasn't yet started. If there were a difference at baseline, it would lead us to believe that the randomization was not carried out properly.

Let's fit a model for mean response with

  • an intercept,
  • a main effect for group,
  • a main effect for \( \sqrt{\text{week}}\), and
  • an interaction between group and \( \sqrt{\text{week}}\).

This allows the two groups to have different intercepts and slopes. Because the intercepts are defined as the average responses at week 0, we expect that the main effect for group (i.e., the difference in intercepts) will be small.
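The four terms above translate into a four-column design matrix. Here is a minimal sketch of one design-matrix row; the function name and layout are illustrative, not taken from the course code:

```python
import math

def design_row(group, week):
    """One design-matrix row for the mean model: intercept, group main
    effect, sqrt(week) main effect, and the group-by-sqrt(week)
    interaction. (Illustrative sketch, not the course's actual code.)"""
    s = math.sqrt(week)
    return [1.0, float(group), s, group * s]

# A drug-group subject (group = 1) observed at week 3:
row = design_row(group=1, week=3)
```

For a placebo subject (group = 0), the group and interaction columns are both zero, so the placebo intercept and slope are carried by the first and third coefficients alone.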

How can we fit this model, taking into account the fact that the multiple observations for a subject are correlated?

Let's try GEE, starting with the simplest version: independence estimating equations (IEE). We will use a normal model rather than a multinomial one just to demonstrate the IEE. However, you should investigate the given SAS code, modify it to specify the multinomial distribution, and compare your results.

Data: The data from the schizophrenia trial (schiz.dat).

  • Column 1: subject ID
  • Column 2: group (0 = placebo, 1 = drug)
  • Column 3: week (0, 1, 3, 6)
  • Column 4: severity of illness (1, ... ,7)

SAS program (schiz.sas) for fitting the model by independence estimating equations (no R code is provided for this example):

SAS program

What's going on?

  • The model statement without any additional options tells SAS to apply a normal error distribution with constant variance. The default link function for the normal model is the identity link.
  • The repeated statement tells PROC GENMOD to fit the GEE with an independence correlation structure (type=ind). The observations are grouped by the class variable subject. The option modelse tells SAS to print out model-based SE's along with those from the sandwich.

Together, these two statements specify an estimation procedure equivalent to ML under an ordinary linear regression model; in other words, the resulting estimates are simply OLS.
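To see why IEE under the identity link reduces to OLS, here is a minimal sketch for a one-predictor model: with a working independence structure and constant variance, the estimating equations are exactly the OLS normal equations. The toy data below are invented for illustration:

```python
def ols_simple(x, y):
    """Closed-form OLS for y = b0 + b1*x: the same estimates that
    independence estimating equations (IEE) produce under an identity
    link with constant working variance."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx          # slope from the normal equations
    b0 = ybar - b1 * xbar   # intercept
    return b0, b1

# Toy data lying exactly on the line y = 1 + 2x:
b0, b1 = ols_simple([0, 1, 2, 3], [1, 3, 5, 7])
```

The point estimates here ignore the within-subject correlation entirely; only the standard errors (below) account for it.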

What is the advantage of using PROC GENMOD here?

The advantage is that the standard errors are computed using the sandwich; if the sample is sufficiently large, then the SE's are going to be reasonable even if the assumptions of independence and constant variance are wrong.
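The contrast between the two kinds of standard error can be sketched in the simplest possible case, an intercept-only model (estimating a grand mean over clustered observations). The data below are invented for illustration, not from the trial:

```python
def mean_variances(clusters):
    """Variance of the grand mean two ways: (a) the model-based formula,
    which treats all observations as independent, and (b) the sandwich,
    which sums residuals within each cluster before squaring.
    Intercept-only illustration with invented data."""
    obs = [y for c in clusters for y in c]
    n = len(obs)
    mu = sum(obs) / n
    resid = [y - mu for y in obs]
    sigma2 = sum(r * r for r in resid) / (n - 1)   # usual variance estimate
    model_based = sigma2 / n
    # Sandwich: B^{-1} M B^{-1}, where B = n and M = sum_i (sum_j r_ij)^2
    m = sum(sum(y - mu for y in c) ** 2 for c in clusters)
    sandwich = m / n ** 2
    return model_based, sandwich

# Two clusters whose members agree perfectly (strong positive
# within-cluster correlation):
mb, sw = mean_variances([[1, 1], [3, 3]])
```

With positive within-cluster correlation, the sandwich variance exceeds the model-based one, which is exactly the situation where naive independence-based SE's would be misleadingly small.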

Let's look at some results.

SAS output

From the "Model information" section, we see that GENMOD is fitting a normal model with an identity link. As we learned, however, the normality doesn't matter; the only part of the normal model being relied upon is the assumption of constant variance, \( \text{Var}(y_{ij}) = \sigma^2 \).

Notice also that out of 413 × 4 = 1652 patient-occasions, only 1600 of them contributed to the model fit; the other 52 had missing values. The exact same results would have been obtained if we had omitted the rows with missing responses from the data file. Now examine the model fit (not the GEE):

SAS output

The unscaled Pearson and Deviance statistics assume that the scale parameter σ2 is equal to 1.00. In practice, we would never assume σ2 = 1.00 for a normal model; we would always estimate it. The estimated scale parameter is equal to Pearson's \( X^2 \) divided by its degrees of freedom,

\( \hat{\sigma}^2 = \dfrac{X^2}{\text{df}}. \)

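The computation is simple enough to sketch directly. Under the identity link, Pearson's \( X^2 \) is just the residual sum of squares; the residual values below are invented for illustration:

```python
def pearson_scale(residuals, n_params):
    """Estimate the scale parameter as Pearson's X^2 (here, the residual
    sum of squares under the identity link) divided by the residual
    degrees of freedom. The residuals below are invented, not the
    trial's actual fitted residuals."""
    x2 = sum(r * r for r in residuals)
    df = len(residuals) - n_params
    return x2 / df

scale = pearson_scale([0.5, -1.5, 1.0, -0.5, 0.5], n_params=2)
```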
Now the table of coefficients:

SAS output

The heading "Empirical standard error estimates" means that the SE's in this table are based on the sandwich.

As we expected, the coefficient for drug, the estimated difference in intercepts, is very small.

The estimated difference in slopes, −0.5243, is highly significant, indicating that the responses are declining over time more quickly for the drug group than for the placebo group.

Because we specified the modelse option, SAS also prints out model-based standard errors. These assume that the working covariance assumption (constant variance and uncorrelated errors within subjects) is correct:

SAS output

In this case, the model-based standard errors are somewhat larger than their sandwich counterparts.

Changing Covariance Structure

With PROC GENMOD, we can also try alternative assumptions about the within-subject correlation structure.

The type=exch or type=cs option specifies an "exchangeable" or "compound symmetry" assumption, in which the observations within a subject are assumed to be equally correlated:

\( \begin{pmatrix}
1 & \rho & \rho & \cdots & \rho\\
\rho & 1 & \rho & \cdots & \rho\\
\rho & \rho & 1 & \cdots & \rho\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
\rho & \rho & \rho & \cdots & 1
\end{pmatrix} \)

This is the same kind of structure that arises in a normal linear model with random intercepts. It acknowledges that correlations exist within a subject, but it assumes that the correlations are constant over time.
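Building this matrix programmatically makes its single-parameter structure obvious; a minimal sketch (function name is illustrative):

```python
def exchangeable(n, rho):
    """n-by-n exchangeable (compound-symmetry) working correlation
    matrix: 1 on the diagonal, a common rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

# Working correlation for the four measurement occasions, with a
# hypothetical rho = 0.3:
R = exchangeable(4, 0.3)
```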

An autoregressive (AR-1) structure, specified by type=ar(1), allows the correlations to diminish over time as

\(Corr(y_{ij},y_{ij'})=\rho ^{|t_{ij}-t_{ij'}|}\),

where tij and tij′ are the observation times for yij and yij′ .

The exchangeable and the autoregressive structures both express the intra-subject correlations in terms of a single parameter ρ. If the subjects are measured at a relatively small common set of occasions, we may be able to estimate an arbitrary correlation matrix.

With \( n_i = 4 \) measurement times per subject, the unstructured matrix would have six correlations to estimate. An unstructured matrix is obtained by the option type=un.
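The count of six comes from the number of distinct off-diagonal pairs in a symmetric correlation matrix:

```python
def n_free_correlations(n_times):
    """Number of distinct correlations in an unstructured working
    correlation matrix for n_times measurement occasions: one per
    unordered pair, n*(n-1)/2."""
    return n_times * (n_times - 1) // 2

k = n_free_correlations(4)
```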

Which correlation structure is best for any given problem?

In principle, it makes sense to think that the one that is most nearly "correct" would be best.

For example, if the autoregressive or exchangeable assumptions are wrong, one might think that the unstructured form would provide the most efficient answer. But that is not necessarily true, because moving to an unstructured matrix introduces many more unknown parameters which could destabilize the model fit.

In certain cases, a wrong structure with a small number of parameters could perform better than a correct structure with many parameters. Unfortunately, the literature does not provide much guidance on how to choose a good working correlation structure for GEE.

Here is a summary of the results from different working correlation structures applied to the data from the schizophrenia trial:


Moving from an independence structure to exchangeable and AR-1, the estimated coefficient for the effect of interest (drug*sqrtweek) becomes larger in magnitude. The movement from −.52 to −.60 is a change of about one standard error, which is not trivial but not huge. The SE's under independence, exchangeable and AR-1 are similar, but under an unstructured correlation the SE gets larger, presumably because of the extra parameters being estimated.

If I had to pick a single answer here, I would probably report the results from exchangeable or AR-1. The independence structure is not plausible, and the unstructured form may be unnecessarily general.

Next, you should play with this problem and change options in GENMOD.

For example, specify DIST=MULTINOMIAL and LINK=CUMLOGIT for polytomous (cumulative-logit) logistic regression.

For more examples on GEE with binomial and polytomous responses, see the references in Agresti (2013, 2007) and the SAS online examples.

To do the same analysis in R, we can use either the gee package or the geepack package. The main difference between the two is that the latter contains an anova method that allows for fit comparisons.