3.6 - The General Linear Test

This is just a general representation of an F-test based on a full and a reduced model. We will use this frequently when we look at more complex models.

Let's illustrate the general linear test here for the single factor experiment:

First we write the full model, \(Y_{ij} = \mu + \tau_i + \epsilon_{ij}\), and then the reduced model, \(Y_{ij} = \mu + \epsilon_{ij}\), which has no \(\tau_i\) term, only an overall mean \(\mu\). This is a rather degenerate model: it says that all the observations come from a single group. The reduced model is equivalent to what we hypothesize when we say the \(\mu_i\) are all equal, i.e.:

\(H_0 \colon \mu_1 = \mu_2 = \dots = \mu_a\)

This is equivalent to our null hypothesis where the \(\tau_i\)'s are all equal to 0.

The reduced model is just another way of stating our hypothesis. In more complex situations, however, this is not the only reduced model that we can write; there are others we could look at.

The general linear test is stated as an F ratio:

\(F=\dfrac{(SSE(R)-SSE(F))/(dfR-dfF)}{SSE(F)/dfF}\)

This is a very general test. You can take any full model and any reduced model nested within it, and test whether the difference between them is significant by comparing their error sums of squares. Under the reduced model, this ratio has an F distribution with \(dfR - dfF\) and \(dfF\) degrees of freedom, which are the numerator and denominator degrees of freedom of the F ratio.
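The F ratio above can be sketched as a small helper. This is a minimal illustration, not part of the Minitab workflow used in this lesson; the function name `general_linear_test` and the input values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def general_linear_test(sse_r, df_r, sse_f, df_f):
    """Return (F, numerator df, denominator df) for a full-vs-reduced comparison."""
    if df_r <= df_f:
        raise ValueError("the reduced model must have more error df than the full model")
    if sse_f <= 0:
        raise ValueError("SSE(F) must be positive")
    num_df = df_r - df_f
    # Extra error explained per extra parameter, relative to the full model's MSE.
    f_stat = ((sse_r - sse_f) / num_df) / (sse_f / df_f)
    return f_stat, num_df, df_f


# Hypothetical inputs (not from the cotton example).
f_stat, num_df, den_df = general_linear_test(sse_r=100.0, df_r=12, sse_f=60.0, df_f=10)
print(f"F = {f_stat:.3f} on ({num_df}, {den_df}) df")
```

The resulting statistic would then be compared with the upper \(\alpha\) critical value of the F distribution on the returned degrees of freedom.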

Let's take a look at this general linear test using Minitab...

Example 3.5: Cotton Weight

Remember this experiment had treatment levels 15, 20, 25, 30, 35 % cotton weight and the observations were the tensile strength of the material.

The full model allows a different mean for each level of cotton weight %.

We can demonstrate the General Linear Test by viewing the ANOVA table from Minitab:

STAT > ANOVA > Balanced ANOVA

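From the ANOVA table, the reduced model has \(SSE(R) = 636.96\) with \(dfR = 24\) (the total sum of squares and its degrees of freedom), and the full model has \(SSE(F) = 161.20\) with \(dfF = 20\). Therefore:

\(F^\ast =\dfrac{(636.96-161.20)/(24-20)}{161.20/20} = \dfrac{118.94}{8.06} = 14.76\)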
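To see why this ratio agrees with the usual ANOVA F-test, it may help to write out the two error sums of squares for the single factor experiment. The notation here is an assumption for illustration: \(N\) is the total number of observations, \(a\) is the number of treatment levels, and \(n_i\) is the number of observations at level \(i\).

\(SSE(R) = \sum_{i=1}^{a}\sum_{j=1}^{n_i}(Y_{ij}-\bar{Y}_{..})^2 = SS_{Total}, \quad dfR = N-1\)

\(SSE(F) = \sum_{i=1}^{a}\sum_{j=1}^{n_i}(Y_{ij}-\bar{Y}_{i.})^2 = SS_{E}, \quad dfF = N-a\)

\(F^\ast = \dfrac{(SS_{Total}-SS_{E})/((N-1)-(N-a))}{SS_{E}/(N-a)} = \dfrac{SS_{Treatment}/(a-1)}{SS_{E}/(N-a)} = \dfrac{MS_{Treatment}}{MS_{E}}\)

For the cotton weight experiment, \(N = 25\) and \(a = 5\), which gives \(dfR = 24\) and \(dfF = 20\).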

This demonstrates the equivalence of this test to the F-test.

We now use the General Linear Test (GLT) to test for Lack of Fit when fitting a series of polynomial regression models to determine the appropriate degree of polynomial.

We can demonstrate the General Linear Test by comparing the quadratic polynomial model (Reduced model), with the full ANOVA model (Full model). Let \(Y_{ij} = \mu + \beta_{1}x_{ij} + \beta_{2}x_{ij}^{2} + \epsilon_{ij}\) be the reduced model, where \(x_{ij}\) is the cotton weight percent. Let \(Y_{ij} = \mu + \tau_i + \epsilon_{ij}\) be the full model.

The General Linear Test - Cotton Weight Example (no sound)

The video above shows \(SSE(R) = 260.126\) with \(dfR = 22\) for the quadratic regression model. The ANOVA shows the full model with \(SSE(F) = 161.20\) and \(dfF = 20\).

Therefore the GLT is:

\(\begin{aligned} F^\ast &=\dfrac{(SSE(R)-SSE(F))/(dfR-dfF)}{SSE(F)/dfF} \\ &=\dfrac{(260.126-161.200)/(22-20)}{161.20/20} \\ &=\dfrac{98.926/2}{8.06} \\ &=\dfrac{49.46}{8.06} \\ &=6.14 \end{aligned}\)

We reject \(H_0\colon \) Quadratic Model and claim there is Lack of Fit if \(F^{*} > F_{1-\alpha}(2, 20) = 3.49\), using \(\alpha = 0.05\).

Since 6.14 > 3.49, we reject the null hypothesis of no Lack of Fit from the quadratic equation and go on to fit a cubic polynomial. From the video above we saw that the cubic term in the equation was indeed significant, with p-value = 0.015.

We can apply the General Linear Test again, now testing whether the cubic equation is adequate. The reduced model is:

\(Y_{ij} = \mu + \beta_{1}x_{ij} + \beta_{2}x_{ij}^{2} + \beta_{3}x_{ij}^{3} + \epsilon_{ij}\)

and the full model is the same as before, the full ANOVA model:

\(Y_{ij} = \mu + \tau_i + \epsilon_{ij}\)

The General Linear Test is now a test for Lack of Fit from the cubic model:

\(\begin{aligned} F^\ast &=\dfrac{(SSE(R)-SSE(F))/(dfR-dfF)}{SSE(F)/dfF} \\ &=\dfrac{(195.146-161.200)/(21-20)}{161.20/20} \\ &=\dfrac{33.95/1}{8.06} \\ &=4.21 \end{aligned}\)

We reject \(H_0 \colon\) Cubic Model and claim there is Lack of Fit if \(F^{*} > F_{0.95} (1, 20) = 4.35\).

Since 4.21 < 4.35, we do not reject \(H_0 \colon\) Cubic Model (no Lack of Fit). We conclude that the data are consistent with the cubic regression model and that higher-order terms are not necessary.
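The two Lack of Fit tests above can be reproduced with a short script. This is a minimal sketch: the SSE values, degrees of freedom, and critical values are the ones quoted in this section, while the names `FULL`, `REDUCED`, and `lack_of_fit_f` are hypothetical and introduced only for illustration.

```python
# SSE and error df values are taken from the text above; 3.49 and 4.35 are the
# F_{0.95} critical values quoted there (not computed here).
FULL = {"sse": 161.20, "df": 20}  # single factor ANOVA model

REDUCED = [
    # (label, SSE(R), dfR, critical value F_{0.95}(dfR - dfF, dfF))
    ("quadratic", 260.126, 22, 3.49),
    ("cubic", 195.146, 21, 4.35),
]


def lack_of_fit_f(sse_r, df_r, sse_f, df_f):
    """General linear test statistic for a reduced model against the full model."""
    return ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)


results = {}
for label, sse_r, df_r, f_crit in REDUCED:
    f_star = lack_of_fit_f(sse_r, df_r, FULL["sse"], FULL["df"])
    reject = f_star > f_crit  # True means evidence of Lack of Fit
    results[label] = (f_star, reject)
    verdict = "lack of fit" if reject else "adequate"
    print(f"{label}: F* = {f_star:.2f}, critical value = {f_crit}, {verdict}")
```

Keeping the full model fixed and swapping only the reduced model mirrors the sequence used here: each polynomial is tested against the same ANOVA model until the Lack of Fit test is no longer significant.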