Suppose we have a study of two categorical variables, each with only two levels. One of the response levels is considered the "success" response and the other the "failure" response. A general 2 × 2 table of the observed counts would be as follows:

| | Success | Failure | Total |
|---|---|---|---|
| Group 1 | A | B | A + B |
| Group 2 | C | D | C + D |

The observed counts in this table represent the following proportions:

| | Success | Failure | Total |
|---|---|---|---|
| Group 1 | \(\hat{p}_1=\frac{A}{A+B}\) | \(1-\hat{p}_1\) | A + B |
| Group 2 | \(\hat{p}_2=\frac{C}{C+D}\) | \(1-\hat{p}_2\) | C + D |

Recall from our Z-test of two proportions that the null hypothesis was that the two population proportions, \(p_1\) and \(p_2\), are equal, while the two-sided alternative hypothesis was that they are not equal.

This null hypothesis is analogous to the null hypothesis of the Chi-square test of independence: that the grouping variable and the response are independent.

Also, if the two success proportions are equal, then the two failure proportions are also equal. Note as well that the conditions for our Z-test were that the number of successes and the number of failures in each group be at least 5. That equates to the Chi-square condition that all expected counts in a 2 × 2 table be at least 5. (Remember that at least 80% of all cells need an expected count of at least 5. With 80% of 4 equal to 3.2, this means all four cells must satisfy the condition.)

When we run a Chi-square test of independence on a 2 × 2 table, the resulting Chi-square test statistic equals the square of the Z-test statistic (i.e., \((Z^*)^2\)) from the Z-test of two independent proportions.
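One way to see why this holds is to write both statistics in terms of the cell counts A, B, C, and D from the general table. The sketch below assumes the pooled form of the Z-test statistic and introduces \(n=A+B+C+D\) as shorthand for the grand total (a symbol not used elsewhere in this lesson), so that the pooled proportion is \(p^*=\frac{A+C}{n}\). The final equality uses the standard shortcut formula for the Chi-square statistic of a 2 × 2 table:

\begin{align} (z^*)^2&=\dfrac{(\hat{p}_1-\hat{p}_2)^2}{p^*(1-p^*)\left(\frac{1}{A+B}+\frac{1}{C+D}\right)}\\&=\dfrac{\left(\frac{AD-BC}{(A+B)(C+D)}\right)^2}{\frac{(A+C)(B+D)}{n^2}\cdot\frac{n}{(A+B)(C+D)}}\\&=\dfrac{n(AD-BC)^2}{(A+B)(C+D)(A+C)(B+D)}=\chi^{2*}\end{align}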

## Application

### Political Affiliation and Opinion

Consider the following example, where we form a 2 × 2 table for Political Party and Opinion by considering only the Favor and Oppose responses:

| | Favor | Oppose | Total |
|---|---|---|---|
| Democrat | 138 | 64 | 202 |
| Republican | 64 | 84 | 148 |
| Total | 202 | 148 | 350 |

The Chi-square test produces a test statistic of 22.00 with a *p*-value of 0.000.

The Z-test comparing the two sample proportions, \(\hat{p}_d=\frac{138}{202}=0.683\) and \(\hat{p}_r=\frac{64}{148}=0.432\), results in a Z-test statistic of \(4.69\) with a *p*-value of \(0.000\).

If we square the Z-test statistic, we get \(4.69^2 = 21.996\), which agrees with \(22.00\) up to rounding error.
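The agreement between the two statistics can be checked numerically without rounding the Z-test statistic first. The following is a minimal sketch using only the Python standard library; the function names `chi_square_statistic` and `pooled_z_statistic` are our own illustrative helpers, and the Z statistic is assumed to use the pooled proportion.

```python
import math

# Observed counts from the table above
# (rows: Democrat, Republican; columns: Favor, Oppose).
observed = [[138, 64],
            [64, 84]]

def chi_square_statistic(table):
    """Pearson chi-square statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / n
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat

def pooled_z_statistic(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2, using the pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

chi2 = chi_square_statistic(observed)
z = pooled_z_statistic(138, 202, 64, 148)

print(f"chi-square = {chi2:.2f}")
print(f"z = {z:.2f}, z^2 = {z ** 2:.2f}")
```

Working with unrounded values, the squared Z statistic and the Chi-square statistic coincide up to floating-point error, which is why the small gap seen by hand comes only from rounding \(z^*\) to two decimals.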

## Try it!

The condiments and gender data were condensed to consider gender and either mustard or ketchup. The manager wants to know whether the proportion of males who prefer ketchup is the same as the proportion of females who prefer ketchup. Test the hypothesis two ways, (1) using the Chi-square test for independence and (2) using the Z-test for two proportions, with a significance level of 10%. Show how the two test statistics are related and compare the p-values.

| Gender / Condiment | Ketchup | Mustard | Total |
|---|---|---|---|
| Male | 15 | 23 | 38 |
| Female | 25 | 19 | 44 |
| Total | 40 | 42 | 82 |

#### Z-test for two proportions

The hypotheses are:

\(H_0\colon p_1-p_2=0\)

\(H_a\colon p_1-p_2\ne 0\)

Let males be denoted as sample one and females as sample two. Using the table, we have:

\(n_1=38\) and \(\hat{p}_1=\frac{15}{38}=0.395\)

\(n_2=44\) and \(\hat{p}_2=\frac{25}{44}=0.568\)

The conditions are satisfied for this test (verify for extra practice).

To calculate the test statistic, we need:

\(p^*=\dfrac{x_1+x_2}{n_1+n_2}=\dfrac{15+25}{38+44}=\dfrac{40}{82}=0.4878\)

The test statistic is:

\begin{align} z^*&=\dfrac{\hat{p}_1-\hat{p}_2-0}{\sqrt{p^*(1-p^*)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}\\&=\dfrac{0.395-0.568}{\sqrt{0.4878(1-0.4878)\left(\frac{1}{38}+\frac{1}{44}\right)}}\\&=-1.567\end{align}

The p-value is \(2P(Z<-1.567)=0.1172\).

The p-value is greater than our significance level of 0.10. Therefore, there is not enough evidence in the data to suggest that the proportion of males who prefer ketchup differs from the proportion of females who prefer ketchup.
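The Z-test calculation above can be reproduced with a short script. This is a minimal sketch using only the Python standard library; the helper name `two_sided_normal_p` is our own, and it relies on the identity \(2P(Z>|z|)=\operatorname{erfc}(|z|/\sqrt{2})\) for a standard normal \(Z\). Because the script does not round intermediate values, its p-value may differ from the hand calculation in the last decimal place.

```python
import math

def two_sided_normal_p(z):
    """Two-sided p-value 2 * P(Z > |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))

# Sample one: males; sample two: females (successes = prefer ketchup).
x1, n1 = 15, 38
x2, n2 = 25, 44

# Pooled proportion under H0: p1 = p2
p_pool = (x1 + x2) / (n1 + n2)

# Standard error of the difference using the pooled proportion
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

z_stat = (x1 / n1 - x2 / n2) / se
p_value = two_sided_normal_p(z_stat)

alpha = 0.10  # significance level stated in the exercise
print(f"pooled proportion = {p_pool:.4f}")
print(f"z = {z_stat:.3f}")
print(f"p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```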

#### Chi-square test for independence

The observed counts, with the expected counts in parentheses, are:

| Gender / Condiment | Ketchup | Mustard | Total |
|---|---|---|---|
| Male | 15 (18.537) | 23 (19.463) | 38 |
| Female | 25 (21.463) | 19 (22.537) | 44 |
| Total | 40 | 42 | 82 |

There are no expected counts less than 5. The test statistic is:

\(\chi^{2*}=\dfrac{(15-18.537)^2}{18.537}+\dfrac{(23-19.463)^2}{19.463}+\dfrac{(25-21.463)^2}{21.463}+\dfrac{(19-22.537)^2}{22.537}=2.46 \)

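With 1 degree of freedom, the p-value is 0.1168. The p-value is greater than our significance level. Therefore, there is not enough evidence to suggest that gender and condiments (ketchup or mustard) are related.

The two tests agree: \((z^*)^2=(-1.567)^2=2.455\), which matches \(\chi^{2*}=2.46\) up to rounding, and the two p-values (0.1172 and 0.1168) differ only because the test statistics were rounded before the p-values were computed.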
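The expected counts and the Chi-square statistic can also be computed directly from the observed table. The sketch below uses only the Python standard library. For the p-value it assumes the 1-degree-of-freedom identity \(P(\chi^2_1>x)=P(|Z|>\sqrt{x})=\operatorname{erfc}(\sqrt{x/2})\), so it applies to 2 × 2 tables only.

```python
import math

# Observed counts (rows: Male, Female; columns: Ketchup, Mustard).
observed = [[15, 23],
            [25, 19]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell: row total * column total / grand total
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Condition check: every expected count should be at least 5
condition_met = all(e >= 5 for row in expected for e in row)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Upper-tail p-value for 1 degree of freedom (valid for 2 x 2 tables only)
p_value = math.erfc(math.sqrt(chi2 / 2))

print("expected counts:", [[round(e, 3) for e in row] for row in expected])
print("all expected counts >= 5:", condition_met)
print(f"chi-square = {chi2:.3f}")
print(f"p-value = {p_value:.3f}")
```

Since no intermediate value is rounded here, the Chi-square statistic equals the square of the unrounded Z-test statistic and the p-values of the two tests coincide.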