9.1 - ANCOVA with Quantitative Factor Levels

Designed experiments often include treatment levels set at increasing numerical values. For example, a chemical process may be hypothesized to depend on two factors: the reagent type (A or B) and temperature. The researchers may therefore conduct an experiment that measures a response for each reagent type at 40, 50, 60, 70, and 80 degrees (F). This is a factorial treatment design; suppose they have 3 replications of each reagent × temperature combination, administered in a completely randomized design.

When treatment levels are quantitative and there are at least 3 levels, we can use ANCOVA to model the quantitative factor with regression. Continuous (regression) variables can be added to our ANOVA through the design matrix of the General Linear Model, as we encountered in Lesson 3a.1-5. Here, we do not code the levels of the quantitative factor; instead, we enter them into the design matrix (X) with their numerical values.

For the experiment mentioned above, we have a simulated dataset in the file: Example_for Quant_factor. in the Week 11 Lesson folder.

We can proceed as usual with a 2 × 5 factorial ANOVA to evaluate the Null Hypotheses

\(H_0 \colon \mu_A = \mu_B\)

\(H_0 \colon \mu_{40} = \mu_{50} = \mu_{60} = \mu_{70} = \mu_{80}\)

and \(H_0 \colon \text{no interaction}\)
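A minimal sketch of this factorial ANOVA in SAS, assuming the data set is named quant_factor and contains the variables reagent, temp, and product (the names used in the code later in this section). Listing temp in the class statement makes SAS treat it as a categorical factor with 5 levels rather than as a number:

```sas
/* 2 x 5 factorial ANOVA: both factors treated as categorical.    */
/* Data set and variable names are assumed from the later example. */
proc mixed data=quant_factor method=type3;
  class reagent temp;
  model product = reagent temp reagent*temp;
  title 'Factorial ANOVA';
run;
```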

This may give the researchers the results they are interested in, but very often it falls short. Beyond comparing the mean responses at the different temperatures, they may also be interested in the trend of the response as the temperature increases.

Our ANCOVA design matrix enables us to add new columns for fitting a quadratic polynomial function to model the effect of temperature. In addition to the reagent type, we add \(\text{temp}\) and \(\text{temp}^2\) to the design matrix, which lets us examine linear and quadratic trends, respectively. We also want to test whether any of the quantitative factor trend terms interact with the reagent type. Before doing this, we center the covariate by subtracting its mean (60) from each temperature level. Centering is important for avoiding a problem referred to as structural multicollinearity, encountered in multiple linear regression. For a review of the multicollinearity problem, see the supplemental materials for STAT 501, where this topic is discussed in more detail:

STAT 501 - Lesson 12: Multicollinearity
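To see why centering helps here, consider a short worked check over the five temperature levels. Because the design is balanced, the levels alone determine the correlation between the linear and quadratic columns. The symbol \(r\) (Pearson correlation) is introduced here only for illustration. For the uncentered columns, \(\text{temp}\) has mean 60 and \(\text{temp}^2\) has mean 3800, giving

\[ r(\text{temp}, \text{temp}^2) = \frac{120000}{\sqrt{1000 \times 14540000}} \approx 0.995 \]

After centering, \(x = \text{temp} - 60\) takes the values \(-20, -10, 0, 10, 20\) and \(x^2\) takes the values \(400, 100, 0, 100, 400\) (mean 200), so

\[ \sum x \, (x^2 - 200) = (-20)(200) + (-10)(-100) + (0)(-200) + (10)(-100) + (20)(200) = 0 \]

and therefore \(r(x, x^2) = 0\). The near-perfect correlation between the linear and quadratic columns disappears once the covariate is centered.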

Once the covariate is centered, we create \(x\) and \(x^2\) and enter these as continuous covariates in the ANCOVA model.

In SAS, this process would look like this:

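/* Center the covariate and create x^2 */
data centered_quant_factor;
  set quant_factor;
  x = temp - 60;   /* 60 is the mean of the temperature levels */
  x2 = x**2;
run;

/* ANCOVA: reagent is categorical; x and x2 are continuous */
proc mixed data=centered_quant_factor method=type3;
  class reagent;
  model product = reagent x x2
        reagent*x reagent*x2;
  title 'Centered';
run;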

Notice that we specify reagent as a classification variable, while x and \(x^2\) enter the model as continuous variables. The interaction terms test whether the regression coefficients differ by categorical treatment level. For example, the reagent*x term tests the null hypothesis that the slopes of the linear regressions are equal for the two reagents, \(H_0 \colon \beta_{1 A} = \beta_{1 B}\), where \(\beta_{1 i}\) is the slope coefficient for reagent \(i\).
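Written out, the model fitted by this code can be sketched as follows. The subscripts \(j\) and \(k\) and the symbols \(\beta_{0 i}\) and \(\beta_{2 i}\) are notation assumed here for illustration, extending the \(\beta_{1 i}\) used in the text:

\[ y_{ijk} = \beta_{0 i} + \beta_{1 i} x_j + \beta_{2 i} x_j^2 + \epsilon_{ijk}, \qquad i \in \{A, B\} \]

where \(x_j = \text{temp}_j - 60\) and \(k\) indexes the replications. Under this parameterization, the reagent*x2 term tests \(H_0 \colon \beta_{2 A} = \beta_{2 B}\), and the reagent term should correspond to comparing the intercepts \(\beta_{0 A}\) and \(\beta_{0 B}\), that is, the two reagents at the centered temperature \(x = 0\) (60 degrees).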

SAS output:

Type 3 Analysis of Variance

| Source | DF | Sum of Squares | Mean Square | Expected Mean Square | Error Term | Error DF | F Value | Pr > F |
|---|---|---|---|---|---|---|---|---|
| reagent | 1 | 3.066357 | 3.066357 | Var(Residual) + Q(reagent) | MS (Residual) | 24 | 2.97 | 0.0977 |
| x | 1 | 97.600495 | 97.600495 | Var(Residual) + Q(x,x*reagent) | MS (Residual) | 24 | 94.52 | < .0001 |
| x2 | 1 | 88.832986 | 88.832986 | Var(Residual) + Q(x2,x2*reagent) | MS (Residual) | 24 | 86.03 | < .0001 |
| x*reagent | 1 | 0.0341215 | 0.0341215 | Var(Residual) + Q(x*reagent) | MS (Residual) | 24 | 0.33 | 0.5707 |
| x2*reagent | 1 | 0.067586 | 0.067586 | Var(Residual) + Q(x2*reagent) | MS (Residual) | 24 | 0.07 | 0.8003 |
| Residual | 24 | 24.782417 | 1.032601 | Var(Residual) | | | | |
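As a quick check on how the table is assembled, each F value is the mean square for that term divided by the residual mean square. Using values from the output above:

\[ MS_{\text{Residual}} = \frac{24.782417}{24} \approx 1.0326 \]

\[ F_{\text{reagent}} = \frac{3.066357}{1.032601} \approx 2.97, \qquad F_{x} = \frac{97.600495}{1.032601} \approx 94.52, \qquad F_{x^2} = \frac{88.832986}{1.032601} \approx 86.03 \]

Each ratio is compared to an F distribution with 1 numerator and 24 denominator degrees of freedom.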

We see that:

  1. The reagent effect was not significant (p = 0.0977).
  2. Both the linear and quadratic terms were significant in describing the trend in the response (p < .0001 for each). Neither term interacted significantly with reagent type (p = 0.5707 and p = 0.8003), so there is no evidence that the linear and quadratic effects differ between the reagent types.
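One possible follow-up, which is not part of the analysis above, is to refit the model without the nonsignificant interaction terms so that both reagents share a common quadratic trend. A minimal sketch, assuming the centered data set created earlier; the solution option is added to request the estimated regression coefficients:

```sas
/* Reduced model (illustrative follow-up, not from the original analysis): */
/* common linear and quadratic trend for both reagents.                    */
proc mixed data=centered_quant_factor method=type3;
  class reagent;
  model product = reagent x x2 / solution;
  title 'Centered, no interactions';
run;
```

Because x is centered at 60, the estimated coefficients describe the trend in terms of deviations from 60 degrees rather than the raw temperature.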