The chi-square (\(\chi^2\)) test of independence is used to test for a relationship between two categorical variables. Recall that if two categorical variables are independent, then \(P(A) = P(A \mid B)\). The chi-square test of independence uses this fact to compute expected values for the cells in a two-way contingency table under the assumption that the two variables are independent (i.e., the null hypothesis is true).
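As a brief sketch of how independence leads to expected values: the symbols \(R\) (row total), \(C\) (column total), and \(n\) (total sample size) are introduced here for illustration, and the sample proportions \(R/n\) and \(C/n\) are used as estimates of the unknown population probabilities. If \(A\) and \(B\) are independent, the probability of falling in both categories is the product of the two separate probabilities:
\(P(A \text{ and } B) = P(A \mid B)\,P(B) = P(A)\,P(B)\)
Multiplying the estimated probability for a cell by the sample size gives the count we would expect in that cell if the null hypothesis were true:
\(E = n \times \dfrac{R}{n} \times \dfrac{C}{n} = \dfrac{R \times C}{n}\)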
Even if two variables are independent in the population, the counts observed in a sample will rarely match the expected counts exactly because of random sampling variation. The chi-square test is used to determine if there is convincing evidence that the two variables are not independent in the population using the same hypothesis testing logic that we used with one mean, one proportion, etc.
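Again, we will be using the five-step hypothesis testing procedure: check the assumptions and write the hypotheses, calculate the test statistic, determine the p-value, make a decision, and state a conclusion.
The assumptions are that the sample is randomly drawn from the population and that all expected values are at least 5 (the formula for expected values is given below).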
Our hypotheses are:
\(H_0:\) There is not a relationship between the two variables in the population (they are independent)
\(H_a:\) There is a relationship between the two variables in the population (they are dependent)
Note: When you're writing the hypotheses for a given scenario, use the names of the variables, not the generic "two variables."
Chi-Square Test Statistic
\(\chi^2=\sum \dfrac{(\text{Observed}-\text{Expected})^2}{\text{Expected}}\)
Expected Cell Value
\(E=\dfrac{\text{row total} \times \text{column total}}{n}\)
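The two formulas above can be sketched in plain Python. This is only an illustration: the function names `expected_counts` and `chi_square_statistic` are hypothetical, and the 2×2 table of counts is made up for the example rather than taken from real data. The sketch also checks the assumption that every expected value is at least 5.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2x2 contingency table (counts invented for illustration).
observed = [[30, 20],
            [20, 30]]

expected = expected_counts(observed)

# Assumption check: every expected value should be at least 5.
all_at_least_5 = all(e >= 5 for row in expected for e in row)

chi_sq = chi_square_statistic(observed, expected)

print(expected)        # [[25.0, 25.0], [25.0, 25.0]]
print(all_at_least_5)  # True
print(chi_sq)          # 4.0
```

Every row and column total in this table is 50 and \(n = 100\), so each expected count is \(50 \times 50 / 100 = 25\). Each cell then contributes \((\pm 5)^2 / 25 = 1\) to the test statistic, for a total of 4.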
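The p-value is the area to the right of your chi-square test statistic on a chi-square distribution with the correct degrees of freedom; it can be found using Minitab. Chi-square tests of independence are always right-tailed tests because any mismatch between the observed and expected counts, in either direction, makes the test statistic larger.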
Degrees of Freedom: Chi-Square Test of Independence
\(df=(\text{number of rows}-1)(\text{number of columns}-1)\)
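For readers without Minitab, the right-tail area can also be approximated with a short script. The sketch below is one possible approach, not the course's method: it uses only Python's standard library and the closed-form series for the chi-square upper tail at whole-number degrees of freedom. The function names `degrees_of_freedom` and `chi_square_right_tail` are hypothetical, and the test statistic of 4.0 is an illustrative value.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (number of rows - 1) * (number of columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def chi_square_right_tail(x, df):
    """Area to the right of x under a chi-square curve with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^k / k! for k = 0 .. df/2 - 1
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail, erfc(sqrt(x/2)) ...
    tail = math.erfc(math.sqrt(half))
    if df == 1:
        return tail
    # ... then add sqrt(2x/pi) * exp(-x/2) * sum of x^k / (1*3*...*(2k+1))
    term = 1.0
    total = 1.0
    for k in range(1, (df - 1) // 2):
        term *= x / (2 * k + 1)
        total += term
    return tail + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


# Illustrative values: a 2x2 table and a test statistic of 4.0.
df = degrees_of_freedom(2, 2)
p_value = chi_square_right_tail(4.0, df)

print(df)                 # 1
print(round(p_value, 4))  # 0.0455
```

With 1 degree of freedom, a chi-square statistic of 4.0 corresponds to a standard normal z of \(\pm 2\), which is why the right-tail area is about 0.0455.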
If \(p \leq \alpha\), reject the null hypothesis.
If \(p>\alpha\), fail to reject the null hypothesis.
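The decision rule can be written as a small function. This is a minimal sketch: the name `decide` is hypothetical, and the default \(\alpha = 0.05\) is an assumption chosen for illustration, since the significance level should be set for each study.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(0.0455))        # reject the null hypothesis
print(decide(0.0455, 0.01))  # fail to reject the null hypothesis
```

The same p-value can lead to different decisions at different significance levels, which is why \(\alpha\) should be chosen before looking at the data.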
Write a conclusion in terms of the original research question.