For the sibling puzzle-solving example, note that the data consisted of 37 responses for six-year-olds (younger siblings) and 37 responses for eight-year-olds (older siblings). How would the results differ if, instead of siblings, those same values had arisen from two independent samples of children of those ages?
To see why the approach with dependent data, matched by siblings, is more powerful, consider the table below. The sample size is now \(n = 37 + 37 = 74\) total responses, compared with \(n = 37\) pairs in the previous approach.
| | <1 min | >1 min | Total |
|---|---|---|---|
| Older | 22 | 15 | 37 |
| Younger | 20 | 17 | 37 |
The estimated difference in proportions from this table is identical to the previous one:
\(\hat{d}=\hat{p}_1-\hat{p}_2=\dfrac{22}{37}-\dfrac{20}{37}=0.0541\)
But with independent data, the test for an age effect, which is equivalent to the usual \(\chi^2\) test of independence, gives \(X^2 = 0.2202\), whereas McNemar's test gave \(z^2 = 0.333\). The standard error for \(\hat{d}\) in this new table is
\(\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{37}+\dfrac{\hat{p}_2(1-\hat{p}_2)}{37}}=0.1150\)
compared with \(0.0932\) when using siblings and taking the covariance between them into account. Just as with matched pairs for quantitative variables, the covariance between pair members leads to smaller standard errors, and thus greater power, than the independent-samples approach.
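To make this concrete, here is a small R sketch (separate from Siblings.R) that reproduces both standard errors. The matched-pairs calculation needs the discordant pair counts, which are not shown in this section; the values \(b = 7\) and \(c = 5\) below are reconstructed from the statistics quoted above (\(\hat{d} = 2/37\) and \(z^2 = (b-c)^2/(b+c) = 0.333\)), so treat them as an assumption.

# independent-samples SE for the difference in proportions
p1 = 22/37   # older siblings solving in under a minute
p2 = 20/37   # younger siblings solving in under a minute
sqrt(p1*(1 - p1)/37 + p2*(1 - p2)/37)   # 0.1150

# matched-pairs SE depends only on the discordant pair counts
# ASSUMPTION: b = 7, c = 5 reconstructed from d-hat and z^2 above
n = 37; b = 7; c = 5
sqrt((b + c) - (b - c)^2/n)/n           # 0.0932

The independent-samples formula treats the two proportions as uncorrelated; the matched-pairs formula subtracts the (positive) covariance contribution, which is why its standard error is smaller.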
Let's take a look at the last part of Siblings.sas and its relevant output, where the same data are analyzed as if they were sampled independently and not matched by siblings.
/* same counts as above, treated as two independent samples */
data notsiblings;
input age $ time $ count;
datalines;
older less 22
older more 15
younger less 20
younger more 17
;
proc freq data=notsiblings order=data;
weight count;
tables age*time / chisq riskdiff;
run;
Now we are doing just a regular test of independence, and the Pearson chi-square is \(0.2202\) with a p-value of \(0.6389\). Although our conclusion is the same (we still can't claim a significant age effect), notice that the p-value is larger, i.e., less significant, when the data are treated as independent. In general, the matched-pairs approach is more powerful.
Statistics for Table of age by time
| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 1 | 0.2202 | 0.6389 |
| Likelihood Ratio Chi-Square | 1 | 0.2204 | 0.6388 |
| Continuity Adj. Chi-Square | 1 | 0.0551 | 0.8145 |
| Mantel-Haenszel Chi-Square | 1 | 0.2173 | 0.6411 |
| Phi Coefficient | | 0.0546 | |
| Contingency Coefficient | | 0.0545 | |
| Cramer's V | | 0.0546 | |
Column 1 Risk Estimates (Difference is Row 1 - Row 2)

| | Risk | ASE | 95% CL (lower) | 95% CL (upper) | Exact 95% CL (lower) | Exact 95% CL (upper) |
|---|---|---|---|---|---|---|
| Row 1 | 0.5946 | 0.0807 | 0.4364 | 0.7528 | 0.4210 | 0.7525 |
| Row 2 | 0.5405 | 0.0819 | 0.3800 | 0.7011 | 0.3692 | 0.7051 |
| Total | 0.5676 | 0.0576 | 0.4547 | 0.6804 | 0.4472 | 0.6823 |
| Difference | 0.0541 | 0.1150 | -0.1714 | 0.2795 | | |
Let's take a look at the last part of Siblings.R and its relevant output, where the same data are analyzed as if they were sampled independently and not matched by siblings.
# counts entered column by column: first <1 min (22, 20), then >1 min (15, 17)
notsiblings = matrix(c(22,20,15,17), nr=2,
  dimnames=list(c("Older","Younger"), c("<1 min",">1 min")))
notsiblings
chisq.test(notsiblings, correct=F)  # Pearson chi-square test of independence
prop.test(notsiblings, correct=F)   # CI for the difference in proportions
Now we are doing just a regular test of independence, and the Pearson chi-square is \(0.2202\) with a p-value of \(0.6389\). Although our conclusion is the same (we still can't claim a significant age effect), notice that the p-value is larger, i.e., less significant, when the data are treated as independent. In general, the matched-pairs approach is more powerful.
> chisq.test(notsiblings, correct=F)
Pearson's Chi-squared test
data: notsiblings
X-squared = 0.22024, df = 1, p-value = 0.6389
> prop.test(notsiblings, correct=F)
2-sample test for equality of proportions without continuity
correction
data: notsiblings
X-squared = 0.22024, df = 1, p-value = 0.6389
alternative hypothesis: two.sided
95 percent confidence interval:
-0.1713610 0.2794691
McNemar's test applies whenever the hypothesis of marginal homogeneity is of interest, and such hypotheses arise in a wide variety of problems involving dependent (matched) observations.
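As a closing illustration, here is a hedged R sketch of McNemar's test applied to the paired version of these data. The within-pair counts are not listed in this section; the table below (15, 7, 5, 10) is reconstructed from the reported margins (22/37 and 20/37) and \(z^2 = 0.333\), so treat it as an assumption rather than the original data.

# rows: older sibling's time; columns: younger sibling's time
# ASSUMPTION: cell counts reconstructed from the margins and z^2 quoted above
siblings = matrix(c(15, 5, 7, 10), nr=2,
  dimnames=list(Older=c("<1 min",">1 min"),
                Younger=c("<1 min",">1 min")))
mcnemar.test(siblings, correct=F)   # X-squared = 0.3333, df = 1, p-value = 0.5637

The statistic \((7-5)^2/(7+5) = 0.333\) matches the \(z^2\) value quoted earlier, and its p-value of about 0.56 is smaller than the 0.6389 from the independent-samples analysis, consistent with the power comparison above.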