4.2 - The Overall Mean Model

Model 1 - The Overall Mean Model

 

\(Y_{ij}=\mu+\epsilon_{ij}\)

which simply fits an overall or "grand" mean. This is the model that holds when \(H_0\) is true, that is, when \(\mu_1=\mu_2=\cdots=\mu_T=\mu\).

To understand how the various facets of the model relate to each other, let us look at a toy example with 3 treatments (or factor levels) and 2 replicates of each treatment. 

We have 6 observations, which means that \(\mathbf{Y}\) is a column vector of dimension 6, as is the error vector \(\boldsymbol{\epsilon}\), whose elements are the random errors associated with the 6 observations. In the GLM form, \(\mathbf{Y}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\epsilon}\), the design matrix \(\mathbf{X}\) for the overall mean model turns out to be a 6-dimensional column vector of ones, and the parameter vector \(\boldsymbol{\beta}\) is a scalar equal to \(\mu\), the overall population mean.

That is,

\(\mathbf{Y}=\begin{bmatrix}
2\\
1\\
3\\
4\\
6\\
5
\end{bmatrix} \text{,}\;\;\;\;
\mathbf{X}=\begin{bmatrix}
1\\
1\\
1\\
1\\
1\\
1
\end{bmatrix} \text{,}\;\;\;\;
\boldsymbol{\beta}=[\mu]  \text{,}\;\;\;\; \text{ and } 
\boldsymbol{\epsilon}=\begin{bmatrix}
\epsilon_1\\
\epsilon_2\\
\epsilon_3\\
\epsilon_4\\
\epsilon_5\\
\epsilon_6
\end{bmatrix}\)

Using the method of least squares, the estimates of the parameters in \(\boldsymbol{\beta}\) are obtained as:

\(\hat{\boldsymbol{\beta}}= (\mathbf{X}^{'} \mathbf{X})^{-1} \mathbf{X}^{'} \mathbf{Y}\)

Using the estimate \(\hat{\boldsymbol{\beta}}\), the \(i^{th}\) predicted response \(\hat{y}_i\) can be computed as \(\hat{y}_i=\mathbf{x}_i^{'} \hat{\boldsymbol{\beta}}\), where \(\mathbf{x}_i^{'}\) denotes the \(i^{th}\) row of the design matrix.
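To make these formulas concrete, here is a minimal R sketch using the toy data above (the variable names `y`, `x`, `beta_hat`, and `y_hat` are just illustrative) that computes \(\hat{\boldsymbol{\beta}}\) and the predicted responses:

# Minimal R sketch: least squares estimate and fitted values for the toy data
y <- matrix(c(2, 1, 3, 4, 6, 5), ncol = 1)      # response vector Y
x <- matrix(rep(1, 6), ncol = 1)                # design matrix X (a column of ones)
beta_hat <- solve(t(x) %*% x) %*% (t(x) %*% y)  # (X'X)^{-1} X'Y
y_hat <- x %*% beta_hat                         # each fitted value equals the grand mean
beta_hat
y_hat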

In this simplest of cases, we can see how the matrix algebra works. The term \(\mathbf{X}^{'} \mathbf{X}\) would be:

\(\begin{bmatrix}
1 & 1 & 1 & 1 &1 & 1
\end{bmatrix}\ast \begin{bmatrix}
1\\
1\\
1\\
1\\
1\\
1
\end{bmatrix}=1+1+1+1+1+1=6 = n\)

 

The term \(\mathbf{X}^{'} \mathbf{Y}\) would be:

\(\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}\ast \begin{bmatrix}
2\\
1\\
3\\
4\\
6\\
5
\end{bmatrix}=2+1+3+4+6+5=21 = \Sigma Y_i\)

So in this case the estimate \(\hat{\boldsymbol{\beta}}\), as expected, is simply the overall mean: \(\hat{\mu} = \bar{y}_{\cdot\cdot} = 21/6 = 3.5\).

Note that because \(\mathbf{X}^{'} \mathbf{X}\) is a scalar in this case, the exponent of \(-1\) in the formula above amounts to ordinary arithmetic division. In the more general setting, the superscript '-1' indicates the inverse operation in matrix algebra.

To perform these matrix operations in SAS IML, we will open a regular SAS editor window, and then copy and paste three components from the file (IML Design Matrices):

Example Section

  1. Step 1

    Procedure initiation, and specification of the dependent variable vector, \(\mathbf{Y}\).

    For our example we have:

    /* Initiate IML, define response variable */
    proc iml;
    y={
    2, 
    1,
    3,
    4,
    6,
    5};
    
    
  2. Step 2

    We then enter a design matrix (\(\mathbf{X}\)). For the Overall Mean model and our example data, we have:

    
    x={
    1,
    1,
    1,
    1,
    1,
    1};
    
    
  3. Step 3

    We can now copy and paste a program for the matrix computations to generate results (regression coefficients and ANOVA output):

    /* Estimate regression coefficients: beta = (X'X)^(-1) X'Y */
    beta=inv(x`*x)*x`*y; 
    beta_label={"Beta_0","Beta_1","Beta_2","Beta_3"};
    print beta [label="Regression Coefficients"
    rowname=beta_label];
    /* Sums of squares and degrees of freedom for the ANOVA table */
    n=nrow(y); 
    p=ncol(x); 
    j=j(n,n,1);
    ss_tot = (y`*y) - (1/n)*(y`*j)*y;
    ss_trt = (beta`*(x`*y)) - (1/n)*(y`*j)*y;
    ss_error = ss_tot - ss_trt; 
    total_df=n-1;
    trt_df=p-1;
    error_df=n-p; 
    ms_trt = ss_trt/(p-1);
    ms_error = ss_error / error_df;
    F=ms_trt/ms_error; 
    /* Assemble the rows and print the ANOVA table */
    empty={.}; 
    row_label= {"Treatment", "Error", "Total"}; 
    col_label={"df" "SS" "MS" "F"}; 
    trt_row= trt_df || ss_trt || ms_trt || F; 
    error_row= error_df || ss_error || ms_error || empty;
    tot_row=total_df || ss_tot || empty || empty; 
    aov = trt_row // error_row // tot_row;
    print aov [label="ANOVA"
    rowname=row_label
    colname=col_label];
    
    

Here is a quick video walk-through to show you the process for how you can do this in SAS. (Right-click and select 'Show All' if your browser does not display the entire screencast window.)

The program can then be run to produce the following output:

Regression Coefficients
Beta_0    3.5

ANOVA
            DF      SS     MS      F
Treatment    0       0
Error        5    17.5    3.5
Total        5    17.5

We see the estimate of the regression coefficient for \(\beta_0\) equals 3.5, which indeed is the overall mean of the response variable, and is also the same value we obtained above for \(\mu\) using 'by-hand' calculations. In this simple case, where the treatment factor has not entered the model, the only item of interest from the ANOVA table would be the \(SS_{Error}\) for later use in the General Linear F-test.

If you would like to examine the internal calculations further, you may optionally add the following lines to the end of the calculation code.

 /* (Optional) Intermediates in the matrix computations */
xprimex=x`*x; print xprimex;
xprimey=x`*y; print xprimey;
xprimexinv=inv(x`*x); print xprimexinv;
check=xprimexinv*xprimex; print check;
SumY2=beta`*(x`*y); print SumY2;
CF=(1/n)*(y`*j)*y; print CF;

This additional code produces the following output:

xprimex
6
xprimey
21
xprimexinv
0.1666667
check
1
SumY2
73.5
CF
73.5

For this model, we effectively have a single treatment level (\(T=1\)) with six replicates (\(n_1=N=6\)). Using this and the formula for \(SS_{trt}\) at the bottom of the previous section, we can verify the computation: \(SS_{trt} = \left(\sum_{j=1}^{N}Y_{1j}\right)^2 / N - CF = 21^2/6 - 73.5 = 0\).
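As a quick numeric check, here is a small R sketch of this identity using the toy data (the names `N`, `CF`, and `ss_trt` are just illustrative; this is not part of the SAS program above):

# Verify SS_trt = (sum of Y)^2 / N - CF = 0 for the overall mean model
y <- c(2, 1, 3, 4, 6, 5)
N <- length(y)
CF <- sum(y)^2 / N           # correction factor: 21^2 / 6 = 73.5
ss_trt <- sum(y)^2 / N - CF  # equals 0, matching SumY2 - CF above
ss_trt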

The ‘check’ calculation confirms that \((\mathbf{X}^{'} \mathbf{X})^{-1}\mathbf{X}^{'}\mathbf{X} = 1\), which is the defining property of the matrix inverse. In this simple case it amounts to division by \(n\), but in the other models we will work with, the inversion is more involved. In general, the inverse of a matrix \(\mathbf{A}\), denoted \(\mathbf{A}^{-1}\), is defined by the matrix identity \(\mathbf{A}^{-1}\mathbf{A}=\mathbf{I}\), where \(\mathbf{I}\) is the identity matrix (a matrix with 1s on the diagonal and 0s elsewhere). In this example, \(\mathbf{A}\) is replaced by \(\mathbf{X}^{'}\mathbf{X}\), which is the scalar 6.
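To see the inverse as a genuine matrix operation rather than scalar division, here is a small illustrative R example (the 2x2 matrix `A` is arbitrary and used only for demonstration):

# An arbitrary invertible 2x2 matrix, chosen only to illustrate A^{-1} A = I
A <- matrix(c(4, 2, 1, 3), nrow = 2)
A_inv <- solve(A)        # matrix inverse
round(A_inv %*% A, 10)   # reproduces the 2x2 identity matrix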

Now let us carry out the full set of computations in R. First, we input the response variable and design matrix.

y = matrix(c(2,1,3,4,6,5), ncol=1)
x = matrix(c(1,1,1,1,1,1), ncol=1)

We can then calculate the regression coefficients.

beta = solve(t(x)%*%x)%*%(t(x)%*%y)
beta
     [,1]
[1,]  3.5

Next we calculate the entries of the ANOVA Table.

n = nrow(y)
p = ncol(x)
J = matrix(1,n,n)
ss_tot = (t(y)%*%y) - (1/n)*(t(y)%*%J)%*%y 
ss_trt = t(beta)%*%(t(x)%*%y) - (1/n)*(t(y)%*%J)%*%y 
ss_error = ss_tot - ss_trt 
total_df = n-1 
trt_df = p-1 
error_df = n-p 
MS_trt = ss_trt/(p-1)
MS_error = ss_error / error_df 
Fstat = MS_trt/MS_error

Then we can build and print the ANOVA table.

ANOVA <- data.frame(
  c("", "Treatment", "Error", "Total"),
  c("DF", trt_df, error_df, total_df),
  c("SS", ss_trt, ss_error, ss_tot),
  c("MS", MS_trt, MS_error, ""),
  c("F", Fstat, "", ""),
  stringsAsFactors = FALSE)
names(ANOVA) <- c(" ", "  ", " ", "", "")
print(ANOVA)
1           DF   SS  MS   F
2 Treatment  0    0 NaN NaN
3     Error  5 17.5 3.5    
4     Total  5 17.5  

The intermediate matrix computations can be calculated with the following commands (outputs not shown).

xprimex = t(x)%*%x 
xprimex
xprimey = t(x)%*%y 
xprimey
xprimexinv = solve(t(x)%*%x) 
xprimexinv
check = xprimexinv %*% xprimex   # matrix multiplication, so the check also works for non-scalar X'X
check
SumY2 = t(beta)%*%(t(x)%*%y) 
SumY2
CF = (1/n)*(t(y)%*%J)%*%y 
CF

This model can be run with built-in R functions as follows:

options(contrasts=c("contr.sum","contr.poly"))
lm1 = lm(as.numeric(y)~1)   # coerce the 6 x 1 matrix y to a vector for a univariate fit
beta = lm1$coefficients
beta

(Intercept) 
3.5 
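As an optional check, the fitted values from this model can be inspected with the base R `fitted()` function; for the overall mean model every fitted value equals the grand mean.

fitted(lm1)   # every fitted value is 3.5, the overall mean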

To obtain the ANOVA table, use the following commands.

library(car)
aov3 = Anova(lm1,type = 3) 
aov3

Anova Table (Type III tests)
Response: y
Sum Sq Df F value   Pr(>F)   
(Intercept)   73.5  1      21 0.005934 **
Residuals     17.5  5                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

In this case, the Type 3 ANOVA table is equivalent to the Type 1 table.

aov1 = aov(lm1)
summary(aov1)

Df Sum Sq Mean Sq F value Pr(>F)
Residuals    5   17.5     3.5