5.3.5 - Cochran-Mantel-Haenszel Test

This is another way to test for conditional independence, by exploring associations in the partial tables of a \(2 \times 2 \times K\) table. Recall that the null hypothesis of conditional independence is equivalent to the statement that all conditional odds ratios given the levels \(k\) are equal to 1, i.e.,

\(H_0 : \theta_{XY(1)} = \theta_{XY(2)} = \cdots = \theta_{XY(K)} = 1\)

The Cochran-Mantel-Haenszel (CMH) test statistic is

\(M^2=\dfrac{[\sum_k(n_{11k}-\mu_{11k})]^2}{\sum_k Var(n_{11k})}\)

where \(\mu_{11k}=E(n_{11k})=\frac{n_{1+k}n_{+1k}}{n_{++k}}\) is the expected frequency of the first cell in the \(k\)th partial table assuming conditional independence holds, and the variance of cell (1, 1) is

\(Var(n_{11k})=\dfrac{n_{1+k}n_{2+k}n_{+1k}n_{+2k}}{n^2_{++k}(n_{++k}-1)}\).
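The two formulas above can be checked numerically. The following R sketch computes \(M^2\) term by term from a \(2 \times 2 \times K\) array. The function name cmh_by_hand and the counts in the small table are hypothetical, made up only for illustration. Under those assumptions, the result should agree with R's mantelhaen.test() when its continuity correction is turned off.

```r
# Hypothetical 2 x 2 x 2 table (illustrative counts only, not real data).
# Dimensions: row (X), column (Y), stratum (Z); array() fills column by column.
tab <- array(c(12, 8, 5, 15,    # stratum 1: n11, n21, n12, n22
               9, 6, 7, 14),    # stratum 2: n11, n21, n12, n22
             dim = c(2, 2, 2))

# CMH statistic computed directly from the formulas in the text
# (cmh_by_hand is a hypothetical helper name)
cmh_by_hand <- function(tab) {
  n11 <- tab[1, 1, ]                  # observed (1,1) cell in each stratum
  r1  <- tab[1, 1, ] + tab[1, 2, ]    # n_{1+k}
  r2  <- tab[2, 1, ] + tab[2, 2, ]    # n_{2+k}
  c1  <- tab[1, 1, ] + tab[2, 1, ]    # n_{+1k}
  c2  <- tab[1, 2, ] + tab[2, 2, ]    # n_{+2k}
  n   <- r1 + r2                      # n_{++k}
  mu  <- r1 * c1 / n                  # expected (1,1) count under H0
  v   <- r1 * r2 * c1 * c2 / (n^2 * (n - 1))   # Var(n_{11k})
  sum(n11 - mu)^2 / sum(v)
}

m2 <- cmh_by_hand(tab)
m2

# Built-in test without continuity correction, for comparison
unname(mantelhaen.test(tab, correct = FALSE)$statistic)
```

Note that the deviations \(n_{11k}-\mu_{11k}\) are summed over the strata before squaring, which is why the statistic has a single degree of freedom.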

Properties of the CMH statistic

- For large samples, when \(H_0\) is true, the CMH statistic has a chi-squared distribution with df = 1.
- If all \(\theta_{XY(k)} = 1\), then the CMH statistic is close to zero.
- If some or all \(\theta_{XY(k)} > 1\), then the CMH statistic is large.
- If some or all \(\theta_{XY(k)} < 1\), then the CMH statistic is large.
- If some \(\theta_{XY(k)} < 1\) and others \(\theta_{XY(k)} > 1\), then the CMH test is less effective, because the deviations \(n_{11k}-\mu_{11k}\) have opposite signs and partly cancel in the numerator; that is, the test works better if the conditional odds ratios are in the same direction and comparable in size.
- The CMH test can be generalized to \(I \times J \times K\) tables, but the generalization depends on the nature of the variables:
  - the general association statistic treats both variables as nominal and has df \(= (I-1)\times(J-1)\).
  - the row mean scores differ statistic treats the row variable as nominal and the column variable as ordinal, and has df \(= I-1\).
  - the nonzero correlation statistic treats both variables as ordinal and has df = 1.
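A small numerical illustration of these properties may help. The R sketch below uses hypothetical counts, chosen only so that the arithmetic is easy to follow. It compares two strata with odds ratios in the same direction against two strata with odds ratios in opposite directions, and then checks the degrees of freedom that mantelhaen.test() reports for a made-up \(3 \times 3 \times 2\) table.

```r
# Hypothetical counts, chosen only to illustrate the properties listed above.
same_dir <- array(c(20, 10, 10, 20,    # stratum 1: odds ratio 4
                    20, 10, 10, 20),   # stratum 2: odds ratio 4
                  dim = c(2, 2, 2))
opp_dir  <- array(c(20, 10, 10, 20,    # stratum 1: odds ratio 4
                    10, 20, 20, 10),   # stratum 2: odds ratio 1/4
                  dim = c(2, 2, 2))

# Same direction: the deviations n11k - mu11k add up, so the statistic is large
stat_same <- unname(mantelhaen.test(same_dir, correct = FALSE)$statistic)
stat_same   # about 13.1

# Opposite directions: the deviations (+5 and -5) cancel in the numerator
stat_opp <- unname(mantelhaen.test(opp_dir, correct = FALSE)$statistic)
stat_opp    # 0, although neither stratum is close to independence

# Hypothetical 3 x 3 x 2 table: the general association version of the test
big <- array(c(10, 5, 3,  6, 9, 4,  2, 7, 11,    # stratum 1
               8, 6, 5,  4, 10, 6,  3, 5, 12),   # stratum 2
             dim = c(3, 3, 2))
df_big <- unname(mantelhaen.test(big)$parameter)
df_big      # (I - 1) * (J - 1) = 4
```

The second case shows why a nonsignificant CMH statistic alone does not establish conditional independence when the stratum-specific associations differ in direction.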

Common odds-ratio estimate

As we have seen before, it’s always informative to have a summary estimate of strength of association (rather than just a hypothesis test). If the associations are similar across the partial tables, we can summarize them with a single value: an estimate of the common odds ratio for a \(2 \times2 \times K\) table is

\(\hat{\theta}_{MH}=\dfrac{\sum_k(n_{11k}n_{22k})/n_{++k}}{\sum_k(n_{12k}n_{21k})/n_{++k}}\)

This is a useful summary statistic especially if the model of homogeneous associations holds, as we will see in the next section.
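As a rough numerical check of the estimator above, the following R sketch evaluates \(\hat{\theta}_{MH}\) directly from its definition. The counts are hypothetical and serve only as an illustration. Because the estimator is a weighted average of the stratum-specific odds ratios, with weights \(n_{12k}n_{21k}/n_{++k}\), it should fall between the smallest and largest conditional odds ratio.

```r
# Hypothetical 2 x 2 x 2 table (illustrative counts only, not real data)
tab <- array(c(12, 8, 5, 15,    # stratum 1: n11, n21, n12, n22
               9, 6, 7, 14),    # stratum 2: n11, n21, n12, n22
             dim = c(2, 2, 2))

n <- apply(tab, 3, sum)         # stratum totals n_{++k}

# Mantel-Haenszel estimate of the common odds ratio, from the formula above
theta_mh <- sum(tab[1, 1, ] * tab[2, 2, ] / n) /
            sum(tab[1, 2, ] * tab[2, 1, ] / n)

# Conditional (stratum-specific) odds ratios, for comparison
theta_k <- (tab[1, 1, ] * tab[2, 2, ]) / (tab[1, 2, ] * tab[2, 1, ])

theta_k     # 4.5 and 3
theta_mh    # about 3.69, between the two conditional odds ratios

# The estimate reported by the built-in function should be the same
unname(mantelhaen.test(tab)$estimate)
```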

Example - Boy Scouts and Juvenile Delinquency

For the boy scout data, the first method (individual chi-squared tests in each conditional table) led us to conclude that B (boy scout status) and D (delinquency) are independent given S (socioeconomic status). Here we repeat our analysis using the CMH test.

In the SAS program file boys.sas, adding the cmh option to the tables statement (e.g., tables SES*scouts*delinquent / chisq cmh) produces the following summary statistics, including the CMH statistics:

Summary Statistics for scout by delinquent
Controlling for SES


Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic	Alternative Hypothesis	DF	Value	Prob
1	Nonzero Correlation	1	0.0080	0.9287
2	Row Mean Scores Differ	1	0.0080	0.9287
3	General Association	1	0.0080	0.9287

The value of the general association statistic, CMH = 0.0080 (df = 1, p-value = 0.9287), is very close to zero, which indicates that the conditional independence model is a good fit for these data; i.e., we cannot reject the null hypothesis.
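The reported p-value can be recovered from the chi-squared distribution with one degree of freedom. This is a minimal R sketch that takes the rounded CMH value from the SAS output as given; the variable names are only illustrative.

```r
# CMH value as reported (rounded) in the SAS output
cmh <- 0.0080

# Upper-tail probability of a chi-squared variable with df = 1
p_value <- pchisq(cmh, df = 1, lower.tail = FALSE)
p_value     # about 0.9287

# Critical value for a test at the 5% level, for comparison
crit <- qchisq(0.95, df = 1)
crit        # about 3.84
```

The observed statistic is far below the critical value, in line with the conclusion that the null hypothesis cannot be rejected.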

Since the hypothesis of conditional independence is tenable, \(\theta_{BD(\text{high})} = \theta_{BD(\text{mid})} = \theta_{BD(\text{low})} = 1\) is also tenable. Below, we can see that the association can be summarized with the common odds ratio estimate of 0.978, with a 95% CI of (0.597, 1.601), which contains 1.


Common Odds Ratio and Relative Risks
Statistic	Method	Value	95% Confidence Limits
Odds Ratio	Mantel-Haenszel	0.9777	0.5970	1.6010
	Logit	0.9770	0.5959	1.6020
Relative Risk (Column 1)	Mantel-Haenszel	0.9974	0.9426	1.0553
	Logit	1.0015	0.9581	1.0468
Relative Risk (Column 2)	Mantel-Haenszel	1.0193	0.6706	1.5495
	Logit	1.0195	0.6712	1.5484

When the conditional odds ratios are similar, as here (\(\theta_{BD(\text{high})} \approx \theta_{BD(\text{mid})} \approx \theta_{BD(\text{low})}\)), the CMH test is typically more powerful than the Pearson chi-squared test we calculated in the previous section (\(X^2 = 0.160\)), because it pools the evidence across the strata into a single degree of freedom.

In R, the corresponding function is mantelhaen.test(), which is used in the file boys.R as shown below:


#### Cochran-Mantel-Haenszel test

## with the default continuity correction
mantelhaen.test(temp)
## without the continuity correction
mantelhaen.test(temp, correct=FALSE)

Here is the output of the second call, without the continuity correction:


Mantel-Haenszel chi-squared test without continuity correction

data:  temp
Mantel-Haenszel X-squared = 0.0080042, df = 1, p-value = 0.9287
alternative hypothesis: true common odds ratio is not equal to 1
95 percent confidence interval:
 0.5970214 1.6009845
sample estimates:
common odds ratio
        0.9776615

R gives the same value as SAS (Mantel-Haenszel \(X^2= 0.008\), df = 1, p-value = 0.9287). Note that mantelhaen.test() computes only the general association version of the CMH statistic, which treats both variables as nominal. The statistic is very close to zero, which indicates that the conditional independence model is a good fit for these data; i.e., we cannot reject the null hypothesis.

As in the SAS output, the association can be summarized with the common odds ratio estimate of 0.978, with a 95% CI of (0.597, 1.601), which is consistent with \(\theta_{BD(\text{high})} = \theta_{BD(\text{mid})} = \theta_{BD(\text{low})} = 1\).
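For readers who want to reproduce these numbers without boys.R, here is a self-contained sketch. The cell counts are not listed in this section; they are assumed to be those of the boy scout data used earlier in the lesson, and the object name boys_tab is hypothetical (boys.R calls its table temp). If the assumed counts are right, the pieces extracted from the test object should match the output shown above.

```r
# Assumed counts for the boy scout data (not given in this section).
# Within each SES level the order is n11, n21, n12, n22, i.e.
# (scout yes, delinquent yes), (scout no, delinquent yes),
# (scout yes, delinquent no),  (scout no, delinquent no).
boys_tab <- array(c(11, 42,  43, 169,    # SES low
                    14, 20, 104, 132,    # SES medium
                     8,  2, 196,  59),   # SES high
                  dim = c(2, 2, 3),
                  dimnames = list(scout      = c("yes", "no"),
                                  delinquent = c("yes", "no"),
                                  SES        = c("low", "medium", "high")))

fit <- mantelhaen.test(boys_tab, correct = FALSE)

fit$statistic   # Mantel-Haenszel X-squared
fit$p.value     # p-value on 1 df
fit$estimate    # Mantel-Haenszel common odds ratio
fit$conf.int    # 95% confidence interval for the common odds ratio

# Conditional odds ratios within each SES level
or_k <- apply(boys_tab, 3, function(m) (m[1, 1] * m[2, 2]) / (m[1, 2] * m[2, 1]))
or_k
```

Working with the returned object rather than the printed output makes it easy to reuse the estimate and its interval in later calculations.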
