5.3.5 - Cochran-Mantel-Haenszel Test

The Cochran-Mantel-Haenszel (CMH) test is another way to test for conditional independence in \(2 \times 2 \times K\) tables; it pools the evidence of association across the \(K\) partial tables into a single statistic. Recall that the null hypothesis of conditional independence is equivalent to the statement that all conditional odds ratios, given the levels \(k\) of the third variable, are equal to 1, i.e.,

\(H_0 : \theta_{XY(1)} = \theta_{XY(2)} = \cdots = \theta_{XY(K)} = 1\)

The Cochran-Mantel-Haenszel (CMH) test statistic is

\(M^2=\dfrac{[\sum_k(n_{11k}-\mu_{11k})]^2}{\sum_k Var(n_{11k})}\)

where \(\mu_{11k}=E(n_{11k})=\frac{n_{1+k}n_{+1k}}{n_{++k}}\) is the expected frequency of the first cell in the \(k\)th partial table under conditional independence, and the variance of cell (1, 1) is

\(Var(n_{11k})=\dfrac{n_{1+k}n_{2+k}n_{+1k}n_{+2k}}{n^2_{++k}(n_{++k}-1)}\).
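The formula above can be evaluated directly. The following is a minimal sketch in R, assuming the boy scout counts from the earlier lesson (scout by delinquent, stratified by SES); the array `tab` and the other variable names are illustrative and are not taken from boys.R, so the counts should be checked against the course data file.

```r
# Counts assumed from the boy scout example; verify against the course data.
# Dimensions: row = scout (yes, no), column = delinquent (yes, no), stratum = SES.
# array() fills column by column, so each stratum is c(n11, n21, n12, n22).
tab <- array(c(11, 42,  43, 169,    # low SES
               14, 20, 104, 132,    # medium SES
                8,  2, 196,  59),   # high SES
             dim = c(2, 2, 3))

n11  <- tab[1, 1, ]
row1 <- tab[1, 1, ] + tab[1, 2, ]   # n_{1+k}
row2 <- tab[2, 1, ] + tab[2, 2, ]   # n_{2+k}
col1 <- tab[1, 1, ] + tab[2, 1, ]   # n_{+1k}
col2 <- tab[1, 2, ] + tab[2, 2, ]   # n_{+2k}
n    <- row1 + row2                 # n_{++k}

mu <- row1 * col1 / n                               # E(n_11k) under H0
v  <- row1 * row2 * col1 * col2 / (n^2 * (n - 1))   # Var(n_11k) under H0

M2 <- sum(n11 - mu)^2 / sum(v)                      # CMH statistic
p  <- pchisq(M2, df = 1, lower.tail = FALSE)        # large-sample p-value

# Cross-check against the built-in test without continuity correction
M2_builtin <- unname(mantelhaen.test(tab, correct = FALSE)$statistic)

print(c(M2 = M2, p = p, M2_builtin = M2_builtin))
```

With these counts, the hand computation and `mantelhaen.test()` should agree, since both evaluate the same sum of deviations and the same sum of variances.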

Properties of the CMH statistic

  • For large samples, when \(H_0\) is true, the CMH statistic has a chi-squared distribution with df = 1.
  • If all \(\theta_{XY(k)} = 1\), then the CMH statistic is close to zero.
  • If some or all \(\theta_{XY(k)} > 1\), then the CMH statistic is large.
  • If some or all \(\theta_{XY(k)} < 1\), then the CMH statistic is large.
  • If some \(\theta_{XY(k)} < 1\) and others \(\theta_{XY(k)} > 1\), then the deviations \(n_{11k}-\mu_{11k}\) have opposite signs and partly cancel in the numerator, so the CMH statistic is less effective; that is, the test works better if the conditional odds ratios are in the same direction and comparable in size.
  • The CMH test can be generalized to \(I \times J \times K\) tables, but this generalization varies depending on the nature of the variables:
    • the general association statistic treats both variables as nominal and thus has df \(= (I - 1)\times(J - 1)\).
    • the row mean scores differ statistic treats the row variable as nominal and column variable as ordinal, and has df \(= I - 1\).
    • the nonzero correlation statistic treats both variables as ordinal, and df = 1.
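The loss of power when the conditional odds ratios point in opposite directions can be illustrated with a small sketch. The counts below are hypothetical and chosen only so that the two strata have odds ratios 4 and 1/4; the object names are illustrative.

```r
# Hypothetical counts (not from the course data), chosen for illustration only.
s_pos <- matrix(c(20, 10, 10, 20), nrow = 2)   # odds ratio = (20*20)/(10*10) = 4
s_neg <- matrix(c(10, 20, 20, 10), nrow = 2)   # odds ratio = (10*10)/(20*20) = 0.25

# Two strata with associations in opposite directions
opposite <- array(c(s_pos, s_neg), dim = c(2, 2, 2))
# Two strata with the same association
same     <- array(c(s_pos, s_pos), dim = c(2, 2, 2))

stat_opposite <- unname(mantelhaen.test(opposite, correct = FALSE)$statistic)
stat_same     <- unname(mantelhaen.test(same,     correct = FALSE)$statistic)

# In 'opposite', the deviations n_11k - mu_11k are +5 and -5, so they cancel
# and the statistic is zero even though each stratum shows a strong association.
print(c(opposite = stat_opposite, same = stat_same))
```

In a case like the first one, a test of homogeneous association is more appropriate than a single pooled statistic.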

Common odds-ratio estimate

As we have seen before, it’s always informative to have a summary estimate of strength of association (rather than just a hypothesis test). If the associations are similar across the partial tables, we can summarize them with a single value: an estimate of the common odds ratio for a \(2 \times2 \times K\) table is

\(\hat{\theta}_{MH}=\dfrac{\sum_k(n_{11k}n_{22k})/n_{++k}}{\sum_k(n_{12k}n_{21k})/n_{++k}}\)

This is a useful summary statistic especially if the model of homogeneous associations holds, as we will see in the next section.
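As a rough illustration, the Mantel-Haenszel estimate can be computed directly from the formula. This sketch assumes the boy scout counts from the earlier lesson; the array `tab` and the variable names are illustrative, and the counts should be verified against the course data.

```r
# Counts assumed from the boy scout example; verify against the course data.
# Each stratum is entered column by column: c(n11, n21, n12, n22).
tab <- array(c(11, 42,  43, 169,    # low SES
               14, 20, 104, 132,    # medium SES
                8,  2, 196,  59),   # high SES
             dim = c(2, 2, 3))

n11 <- tab[1, 1, ]; n12 <- tab[1, 2, ]
n21 <- tab[2, 1, ]; n22 <- tab[2, 2, ]
n   <- n11 + n12 + n21 + n22        # n_{++k}

# Conditional (stratum-specific) odds ratios
theta_k <- (n11 * n22) / (n12 * n21)

# Mantel-Haenszel estimate of the common odds ratio
theta_MH <- sum(n11 * n22 / n) / sum(n12 * n21 / n)

print(round(theta_k, 3))
print(round(theta_MH, 4))
```

The stratum-specific odds ratios indicate whether a single summary value is reasonable; if they differ greatly in size or direction, the common estimate can be misleading.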

Example - Boy Scouts and Juvenile Delinquency

For the boy scout data, using the first method of carrying out individual chi-squared tests in each conditional table, we concluded that B and D are independent given S. Here we repeat our analysis using the CMH test.

In the SAS program file boys.sas, the cmh option (e.g., tables SES*scouts*delinquent / chisq cmh) produces the following summary statistics, which include the CMH statistics:

Summary Statistics for scout by delinquent
Controlling for SES

 
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic    Alternative Hypothesis     DF    Value     Prob
1            Nonzero Correlation         1    0.0080    0.9287
2            Row Mean Scores Differ      1    0.0080    0.9287
3            General Association         1    0.0080    0.9287

The general association statistic, CMH = 0.0080 (df = 1, p-value = 0.9287), is very close to zero, which indicates that the conditional independence model is a good fit for these data; i.e., we cannot reject the null hypothesis.

Since the hypothesis of conditional independence is tenable, \(\theta_{BD(\text{high})} = \theta_{BD(\text{mid})} = \theta_{BD(\text{low})} = 1\) is also tenable. Below, we can see that the association can be summarized with the Mantel-Haenszel estimate of the common odds ratio, 0.978, with a 95% CI of (0.597, 1.601), which contains 1.

 
Common Odds Ratio and Relative Risks
Statistic                   Method             Value     95% Confidence Limits
Odds Ratio                  Mantel-Haenszel    0.9777    0.5970    1.6010
                            Logit              0.9770    0.5959    1.6020
Relative Risk (Column 1)    Mantel-Haenszel    0.9974    0.9426    1.0553
                            Logit              1.0015    0.9581    1.0468
Relative Risk (Column 2)    Mantel-Haenszel    1.0193    0.6706    1.5495
                            Logit              1.0195    0.6712    1.5484
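The "Logit" row for the odds ratio is not derived in this section. One plausible reading, offered here as an assumption, is that it is the inverse-variance weighted average of the stratum log odds ratios (often called Woolf's estimator). The sketch below uses the boy scout counts assumed from the earlier lesson and illustrative variable names; it is not the SAS implementation.

```r
# Counts assumed from the boy scout example; verify against the course data.
# Each stratum is entered column by column: c(n11, n21, n12, n22).
tab <- array(c(11, 42,  43, 169,    # low SES
               14, 20, 104, 132,    # medium SES
                8,  2, 196,  59),   # high SES
             dim = c(2, 2, 3))

n11 <- tab[1, 1, ]; n12 <- tab[1, 2, ]
n21 <- tab[2, 1, ]; n22 <- tab[2, 2, ]

log_or <- log((n11 * n22) / (n12 * n21))        # stratum log odds ratios
w      <- 1 / (1/n11 + 1/n12 + 1/n21 + 1/n22)   # inverse of their estimated variances

pooled <- sum(w * log_or) / sum(w)              # weighted mean on the log scale
se     <- sqrt(1 / sum(w))                      # its standard error

theta_logit <- exp(pooled)
ci_logit    <- exp(pooled + c(-1, 1) * qnorm(0.975) * se)

print(round(c(estimate = theta_logit, lower = ci_logit[1], upper = ci_logit[2]), 4))
```

Under this assumption the result is close to the Logit odds ratio row in the table (about 0.977 with limits near 0.596 and 1.602). The estimator needs every cell count to be positive, which is one reason the Mantel-Haenszel estimate is often preferred for sparse tables.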

Since the conditional odds ratios are similar, \(\theta_{BD(\text{high})} \approx \theta_{BD(\text{mid})} \approx \theta_{BD(\text{low})}\), the CMH test is typically more powerful than the Pearson chi-squared test we calculated in the previous section, \(X^2 = 0.160\).

In R, the corresponding function is mantelhaen.test(), which is used in the file boys.R as shown below:


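#### Cochran-Mantel-Haenszel test

# default call: a continuity correction may be applied (correct=TRUE)
mantelhaen.test(temp)
# without continuity correction, comparable to the SAS CMH value
mantelhaen.test(temp, correct=FALSE)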

Here is the output of the second call (without continuity correction):


Mantel-Haenszel chi-squared test without continuity correction

data:  temp
Mantel-Haenszel X-squared = 0.0080042, df = 1, p-value = 0.9287
alternative hypothesis: true common odds ratio is not equal to 1
95 percent confidence interval:
 0.5970214 1.6009845
sample estimates:
common odds ratio
        0.9776615 

This gives the same value as SAS (Mantel-Haenszel \(X^2= 0.008\), df = 1, p-value = 0.9287). Note that R computes only the general association version of the CMH statistic, which treats both variables as nominal. The statistic is very close to zero, which indicates that the conditional independence model is a good fit for these data; i.e., we cannot reject the null hypothesis.
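To illustrate the general association version for a larger table, here is a minimal sketch with hypothetical counts for a \(3 \times 3 \times 2\) table. The data and object names are invented for illustration; only `mantelhaen.test()` itself comes from the text above.

```r
# Hypothetical 3 x 3 x 2 table (not from the course data).
stratum1 <- matrix(c(10, 5,  3,
                      6, 9,  4,
                      2, 7, 11), nrow = 3)
stratum2 <- matrix(c( 8,  6,  2,
                      5, 10,  6,
                      3,  4, 12), nrow = 3)
big_tab <- array(c(stratum1, stratum2), dim = c(3, 3, 2))

res <- mantelhaen.test(big_tab)

# For an I x J x K table the reported df is (I - 1) * (J - 1), here 2 * 2 = 4,
# which corresponds to the general association statistic.
print(res$statistic)
print(res$parameter)
```

Because both variables are treated as nominal, reordering the rows or columns of each stratum in the same way should leave this statistic unchanged; the ordinal versions (row mean scores, nonzero correlation) need scores and are reported by SAS rather than by this function.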


