11.3.1 - Symmetry Model

A rather restrictive form of agreement between two variables is symmetry, where the expected counts satisfy \(\mu_{ij}=\mu_{ji}\) for all \(i\) and \(j\). As the name suggests, this means the table of expected counts is a symmetric matrix.

Using the notation from the Siskel and Ebert example with general log-linear model

\(\log\mu_{ij}=\lambda+\lambda^S_i+\lambda^E_j+\lambda^{SE}_{ij}\)

the symmetry assumption would mean that

\(\lambda_{ij}^{SE}=\lambda^{SE}_{ji}\)
\(\lambda^S_i=\lambda^E_i\)

for all \(i\) and \(j\). Note that this differs from the independence assumption that all \(\lambda_{ij}^{SE}=0\). Both are restrictive models, but neither is a special case of the other.

For an \(I\times I\) table, then there are \(I\) total counts along the main diagonal and \(I(I-1)\) total counts off the main diagonal, giving \(I+(I(I-1)/2\) unique parameters to be estimated to fit the symmetry model.

The challenge in actually fitting this model, however, is that the symmetry restrictions are not simply removing certain terms (setting them equal to 0). Rather, the restriction is that certain counts---the off-diagonal ones---are required to be equal. So, we use a regression approach with some carefully defined indicator variables. Specifically, for each observation in the sample, we take \(I_{ij}=1\) if the observation falls in row \(i\) and column \(j\) OR if the observation falls in row \(j\) and column \(i\), for all \(i\le j\le I\); otherwise \(I_{ij}=0\).

The idea is that the symmetry model views the \(n_{ij}\) count and the \(n_{ji}\) count as coming from a common population, and so only a single parameter is attributed to its mean. For the Siskel and Ebert example, this reduces to

\(\log\mu_{ij}=\beta_1I_{11}+\beta_2I_{22}+\beta_3I_{33}+ \beta_4I_{12}+\beta_5I_{13}+\beta_6I_{23} \)

Thus, the \(i^{th}\) main diagonal mean is \(\mu_{ii}=e^{\beta_i}\), and the \((i,j)^{th}\) off-diagonal mean is \(\mu_{ij}=\mu_{ji}=e^{\beta}\), where \(\beta\) is the coefficient for \(I_{ij}\) (for \(i<j\)).

Let's see how we can do this in SAS and R.

Take a look at the SAS code in MovieCritics.sas. The data used previously is updated with the indicators we'll use for the symmetry model. These replace not only the original variables S and E but also the intercept, giving a total of \(I+I(I-1)/2=6\) parameters in this case (with \(I=3\)).

data critic;
set critic;

/* indicator variables for each diagonal cell*/
I11=0; if siskel='con' AND ebert='con' THEN I11=1;
I22=0; if siskel='mixed' AND ebert='mixed' THEN I22=1;
I33=0; if siskel='pro' AND ebert='pro' THEN I33=1;

/* indicator variables for each off-diagonal cell*/
I12=0; if siskel='con' AND ebert='mixed' THEN I12=1;
 if siskel='mixed' AND ebert='con' THEN I12=1;
I13=0; if siskel='con' AND ebert='pro' THEN I13=1;
 if siskel='pro' AND ebert='con' THEN I13=1;
I23=0; if siskel='mixed' AND ebert='pro' THEN I23=1;
 if siskel='pro' AND ebert='mixed' THEN I23=1;
; run;

/* symmetry */
proc genmod data=critic order=data;
model count=I11 I22 I33 I12 I13 I23 / 
 link=log dist=poisson noint predicted;
title "Symmetry Model";
run;

From the output, note that the deviance statistic \(G^2=0.5928\) is comparing the symmetry model to the saturated model, which has 3 degrees of freedom corresponding to the difference in the models' numbers of parameters: 9 versus 6. The p-value is fairly large, indicating that the symmetry model is a reasonable fit for this data.


Criteria For Assessing Goodness Of Fit
Criterion	DF	Value	Value/DF
Deviance	3	0.5928	0.1976
Scaled Deviance	3	0.5928	0.1976
Pearson Chi-Square	3	0.5913	0.1971
Scaled Pearson X2	3	0.5913	0.1971
Log Likelihood		351.2829
Full Log Likelihood		-20.3921
AIC (smaller is better)		52.7842
AICC (smaller is better)		94.7842
BIC (smaller is better)		53.9676

The estimates of the model parameters are given in the table below. We can exponentiate these to get the mean estimates for the \(9\times 9\) table. For example, the estimate for \(\mu_{13}\) is \(e^{2.4423}=11.5\), which also can be calculated intuitively from the table counts directly:

\(\hat{\mu}_{13}=\dfrac{n_{13}+n_{31}}{2}=\dfrac{13+10}{2}\)


Analysis Of Maximum Likelihood Parameter Estimates
Parameter	DF	Estimate	Standard Error	Wald 95% Confidence Limits		Wald Chi-Square	Pr > ChiSq
Intercept	0	0.0000	0.0000	0.0000	0.0000	.	.
I11	1	3.1781	0.2041	2.7780	3.5781	242.40	<.0001
I22	1	2.5649	0.2774	2.0214	3.1085	85.53	<.0001
I33	1	4.1589	0.1250	3.9139	4.4039	1106.96	<.0001
I12	1	2.0794	0.2500	1.5895	2.5694	69.19	<.0001
I13	1	2.4423	0.2085	2.0337	2.8510	137.20	<.0001
I23	1	2.3026	0.2236	1.8643	2.7408	106.04	<.0001
Scale	0	1.0000	0.0000	1.0000	1.0000

Note:The scale parameter was held fixed.

Take a look at the R code in MovieCritics.R. The data is converted to the "stacked" or long format to fit the log-linear model, and the indicators we need for the symmetry model replace the earlier factors S and E. Also, note that the intercept is dropped with the "-1" specification. This is to avoid an overparameterization, given the \(I+I(I-1)/2=6\) indicator variables are already present (with \(I=3\)).

S = factor(rep(c("con","mixed","pro"),each=3)) # siskel
E = factor(rep(c("con","mixed","pro"),times=3)) # ebert
count = c(24,8,13,8,13,11,10,9,64)

# indicator variables for diagonals
I11 = (S=="con")*(E=="con")
I22 = (S=="mixed")*(E=="mixed")
I33 = (S=="pro")*(E=="pro")

# indicator variables for off-diagonals
I12 = (S=="con")*(E=="mixed") + (E=="con")*(S=="mixed")
I13 = (S=="con")*(E=="pro") + (E=="con")*(S=="pro")
I23 = (S=="mixed")*(E=="pro") + (E=="mixed")*(S=="pro")

# symmetry model
fit.s = glm(count~I11+I12+I13+I22+I23+I33-1,family=poisson(link='log'))

At the end of the output, note that the deviance statistic \(G^2=0.59276\) is comparing the symmetry model to the saturated model, which has 3 degrees of freedom corresponding to the difference in the models' numbers of parameters: 9 versus 6. The p-value is fairly large, indicating that the symmetry model is a reasonable fit to this data.

The estimates of the model parameters are given in the "Coefficients" table below. We can exponentiate these to get the mean estimates for the \(9\times 9\) table. For example, the estimate for \(\mu_{13}\) is \(e^{2.4423}=11.5\), which also can be calculated intuitively from the table counts directly:

\(\hat{\mu}_{13}=\dfrac{n_{13}+n_{31}}{2}=\dfrac{13+10}{2}\)

> summary(fit.s)

Call:
glm(formula = count ~ I11 + I12 + I13 + I22 + I23 + I33 - 1, 
    family = poisson(link = "log"))

Deviance Residuals: 
      1        2        3        4        5        6        7        8  
 0.0000   0.0000   0.4332   0.0000   0.0000   0.3112  -0.4525  -0.3217  
      9  
 0.0000  

Coefficients:
    Estimate Std. Error z value Pr(>|z|)    
I11   3.1781     0.2041  15.569   <2e-16 ***
I12   2.0794     0.2500   8.318   <2e-16 ***
I13   2.4423     0.2085  11.713   <2e-16 ***
I22   2.5649     0.2774   9.248   <2e-16 ***
I23   2.3026     0.2236  10.297   <2e-16 ***
I33   4.1589     0.1250  33.271   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 721.15855  on 9  degrees of freedom
Residual deviance:   0.59276  on 3  degrees of freedom
AIC: 52.784

Number of Fisher Scoring iterations: 4

> mu.s = fitted.values(fit.s)
> matrix(mu.s,nrow=3)
     [,1] [,2] [,3]
[1,] 24.0    8 11.5
[2,]  8.0   13 10.0
[3,] 11.5   10 64.0

Although the symmetry model fits well in this example, it may be too restrictive for other types of data. So, something less restrictive (requiring less agreement) may be more suitable. This leads us to consider "quasi-symmetry".

^[1]	Link
↥	Has Tooltip/Popover
	Toggleable Visibility