As suggested in the introduction to this lesson, the test for homogeneity is a method, based on the chi-square statistic, for testing whether two or more multinomial distributions are equal. Let's start by trying to get a feel for how our data might "look" if we have two equal multinomial distributions.
Example 17-1 Section
A university admissions officer was concerned that males and females were accepted at different rates into the four different schools (business, engineering, liberal arts, and science) at her university. She collected the following data on the acceptance of 1200 males and 800 females who applied to the university:
#(Acceptances) | Business | Engineering | Liberal Arts | Science | Total (fixed) |
---|---|---|---|---|---|
Male | 300 (25%) | 240 (20%) | 300 (25%) | 360 (30%) | 1200 |
Female | 200 (25%) | 160 (20%) | 200 (25%) | 240 (30%) | 800 |
Total | 500 (25%) | 400 (20%) | 500 (25%) | 600 (30%) | 2000 |
Are males and females distributed equally among the various schools?
Answer
Let's start by focusing on the business school. We can see that, of the 1200 males who applied to the university, 300 (or 25%) were accepted into the business school. Of the 800 females who applied to the university, 200 (or 25%) were accepted into the business school. So, the business school looks to be in good shape, as an equal percentage of males and females, namely 25%, were accepted into it.
Now, for the engineering school. We can see that, of the 1200 males who applied to the university, 240 (or 20%) were accepted into the engineering school. Of the 800 females who applied to the university, 160 (or 20%) were accepted into the engineering school. So, the engineering school also looks to be in good shape, as an equal percentage of males and females, namely 20%, were accepted into it.
We probably don't have to drag this out any further. If we look at each column in the table, we see that the proportion of males accepted into each school is the same as the proportion of females accepted into each school... which therefore happens to equal the proportion of students accepted into each school, regardless of gender. Therefore, we can conclude that males and females are distributed equally among the four schools.
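If it helps to see that arithmetic spelled out, here's a minimal Python sketch (with the counts typed in directly from the table above) that computes each row's acceptance proportions:

```python
# Acceptance counts from Example 17-1
# columns: Business, Engineering, Liberal Arts, Science
counts = {
    "Male":   [300, 240, 300, 360],   # n1 = 1200
    "Female": [200, 160, 200, 240],   # n2 =  800
}

for group, row in counts.items():
    n = sum(row)
    proportions = [y / n for y in row]
    print(group, proportions)

# Both rows come out to 0.25, 0.20, 0.25, 0.30 -- the two samples follow
# the same multinomial distribution across the four schools.
```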
Example 17-2 Section
A university admissions officer was concerned that males and females were accepted at different rates into the four different schools (business, engineering, liberal arts, and science) at her university. She collected the following data on the acceptance of 1200 males and 800 females who applied to the university:
#(Acceptances) | Business | Engineering | Liberal Arts | Science | Total (fixed) |
---|---|---|---|---|---|
Male | 240 (20%) | 480 (40%) | 120 (10%) | 360 (30%) | 1200 |
Female | 240 (30%) | 80 (10%) | 320 (40%) | 160 (20%) | 800 |
Total | 480 (24%) | 560 (28%) | 440 (22%) | 520 (26%) | 2000 |
Are males and females distributed equally among the various schools?
Answer
Let's again start by focusing on the business school. In this case, of the 1200 males who applied to the university, 240 (or 20%) were accepted into the business school. And, of the 800 females who applied to the university, 240 (or 30%) were accepted into the business school. So, the business school appears to have different rates of acceptance for males and females, 20% compared to 30%.
Now, for the engineering school. We can see that, of the 1200 males who applied to the university, 480 (or 40%) were accepted into the engineering school. Of the 800 females who applied to the university, only 80 (or 10%) were accepted into the engineering school. So, the engineering school also appears to have different rates of acceptance for males and females, 40% compared to 10%.
Again, there's no need to drag this out any further. If we look at each column in the table, we see that the proportion of males accepted into each school differs from the proportion of females accepted into each school... and, as a result, the overall proportion of students accepted into each school, regardless of gender, differs from the gender-specific proportions. Therefore, we can conclude that males and females are not distributed equally among the four schools.
In the context of the two examples above, it quickly becomes apparent that if we wanted to formally test the hypothesis that males and females are distributed equally among the four schools, we'd want to test the hypotheses:
\(H_0 : p_{MB} =p_{FB} \text{ and } p_{ME} =p_{FE} \text{ and } p_{ML} =p_{FL} \text{ and } p_{MS} =p_{FS}\)
\(H_A : p_{MB} \ne p_{FB} \text{ or } p_{ME} \ne p_{FE} \text{ or } p_{ML} \ne p_{FL} \text{ or } p_{MS} \ne p_{FS}\)
where:
- \(p_{Mj}\) is the proportion of males accepted into school j = B, E, L, or S
- \(p_{Fj}\) is the proportion of females accepted into school j = B, E, L, or S
In conducting such a hypothesis test, we're comparing the proportions of two multinomial distributions. Before we can develop the method for conducting such a hypothesis test, that is, for comparing the proportions of two multinomial distributions, we first need to define some notation.
Notation Section
We'll use what I think most statisticians would consider standard notation, namely that:
- The letter i will index the h row categories, and
- The letter j will index the k column categories
(The text reverses the use of the i index and the j index.) That said, let's use the framework of the previous examples to introduce the notation we'll use. That is, rewrite the tables above using the following generic notation:
#(Acc) | Bus \(\left(j = 1 \right)\) | Eng \(\left(j = 2 \right)\) | L Arts \(\left(j = 3 \right)\) | Sci \(\left(j = 4 \right)\) | Total (fixed) |
---|---|---|---|---|---|
M \(\left(i = 1 \right)\) | \(y_{11} \left(\hat{p}_{11} \right)\) | \(y_{12} \left(\hat{p}_{12} \right)\) | \(y_{13} \left(\hat{p}_{13} \right)\) | \(y_{14} \left(\hat{p}_{14} \right)\) | \(n_{1}=\sum_{j=1}^{k} y_{1j}\) |
F \(\left(i = 2 \right)\) | \(y_{21} \left(\hat{p}_{21} \right)\) | \(y_{22} \left(\hat{p}_{22} \right)\) | \(y_{23} \left(\hat{p}_{23} \right)\) | \(y_{24} \left(\hat{p}_{24} \right)\) | \(n_{2}=\sum_{j=1}^{k} y_{2j}\) |
Total | \(y_{11} + y_{21} \left(\hat{p}_1 \right)\) | \(y_{12} + y_{22} \left(\hat{p}_2 \right)\) | \(y_{13} + y_{23} \left(\hat{p}_3 \right)\) | \(y_{14} + y_{24} \left(\hat{p}_4 \right)\) | \(n_1 + n_2\) |
with:
- \(y_{ij}\) denoting the number falling into the \(j^{th}\) category of the \(i^{th}\) sample
- \(\hat{p}_{ij}=y_{ij}/n_i\) denoting the proportion in the \(i^{th}\) sample falling into the \(j^{th}\) category
- \(n_i=\sum_{j=1}^{k}y_{ij}\) denoting the total number in the \(i^{th}\) sample
- \( \hat{p}_{j}=(y_{1j}+y_{2j})/(n_1+n_2) \) denoting the (overall) proportion falling into the \(j^{th}\) category
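To make the notation concrete, here's a minimal Python sketch (using NumPy; the variable names are ours, and the counts are those from Example 17-2) that computes the \(n_i\), the \(\hat{p}_{ij}\), and the pooled \(\hat{p}_j\):

```python
import numpy as np

# y[i, j]: counts from Example 17-2 (rows: Male, Female; columns: B, E, L, S)
y = np.array([[240, 480, 120, 360],
              [240,  80, 320, 160]])

n = y.sum(axis=1)                    # n_i: row (sample) totals -> 1200 and 800
p_hat_ij = y / n[:, None]            # p-hat_ij = y_ij / n_i, the sample proportions
p_hat_j = y.sum(axis=0) / n.sum()    # pooled p-hat_j = (y_1j + y_2j) / (n_1 + n_2)

print(n)          # sample sizes
print(p_hat_ij)   # row 1: 0.2, 0.4, 0.1, 0.3; row 2: 0.3, 0.1, 0.4, 0.2
print(p_hat_j)    # pooled proportions: 0.24, 0.28, 0.22, 0.26
```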
With the notation defined as such, we are now ready to formulate the chi-square test statistic for testing the equality of two multinomial distributions.
The Chi-Square Test Statistic Section
Theorem
The chi-square test statistic for testing the equality of two multinomial distributions:
\(Q=\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(y_{ij}- n_i\hat{p}_j)^2}{n_i\hat{p}_j}\)
follows an approximate chi-square distribution with k−1 degrees of freedom. Reject the null hypothesis of equal proportions if Q is large, that is, if:
\(Q \ge \chi_{\alpha, k-1}^{2}\)
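Before diving into the proof, here's a minimal Python sketch of the statistic in action, again using the Example 17-2 counts; SciPy is used only to look up the chi-square critical value:

```python
import numpy as np
from scipy.stats import chi2

# Observed counts from Example 17-2 (rows: Male, Female)
y = np.array([[240, 480, 120, 360],
              [240,  80, 320, 160]])

n = y.sum(axis=1)                      # n_i: sample sizes
p_hat = y.sum(axis=0) / n.sum()        # pooled p-hat_j
expected = np.outer(n, p_hat)          # n_i * p-hat_j for every cell

Q = ((y - expected) ** 2 / expected).sum()
critical = chi2.ppf(1 - 0.05, df=y.shape[1] - 1)   # k - 1 = 3 degrees of freedom

print(Q > critical)   # True: Q is far beyond 7.815, so reject the null hypothesis
```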
Proof
For the sake of concreteness, let's again use the framework of our example above to derive the chi-square test statistic. For one of the samples, say for the males, we know that:
\(\sum_{j=1}^{k}\frac{(\text{observed }-\text{ expected})^2}{\text{expected}}=\sum_{j=1}^{k}\frac{(y_{1j}- n_1p_{1j})^2}{n_1p_{1j}} \)
follows an approximate chi-square distribution with k−1 degrees of freedom. For the other sample, that is, for the females, we know that:
\(\sum_{j=1}^{k}\frac{(\text{observed }-\text{ expected})^2}{\text{expected}}=\sum_{j=1}^{k}\frac{(y_{2j}- n_2p_{2j})^2}{n_2p_{2j}} \)
follows an approximate chi-square distribution with k−1 degrees of freedom. Therefore, by the independence of the two samples, we can "add up the chi-squares," that is:
\(\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(y_{ij}- n_ip_{ij})^2}{n_ip_{ij}}\)
follows an approximate chi-square distribution with k−1+ k−1 = 2(k−1) degrees of freedom.
Oops... but we have a problem! The \(p_{ij}\)'s are unknown to us. Of course, we know by now that the solution is to estimate the \(p_{ij}\)'s. Now, just how do we do that? Well, if the null hypothesis is true, that is, if the proportions are equal:
\(p_{11}=p_{21}, \quad p_{12}=p_{22}, \quad \ldots, \quad p_{1k}=p_{2k}\)
we would be best served by using all of the data across both samples. That is, the best estimate for each \(j^{th}\) category is the pooled estimate:
\(\hat{p}_j=\frac{y_{1j}+y_{2j}}{n_1+n_2}\)
We also know by now that because we are estimating some parameters, we have to adjust the degrees of freedom. The pooled estimates \(\hat{p}_j\) estimate the true unknown proportions \(p_{1j} = p_{2j} = p_j\). Now, if we know the first k−1 estimates, that is, if we know:
\(\hat{p}_1, \hat{p}_2, ... , \hat{p}_{k-1}\)
then the \(k^{th}\) one, that is \(\hat{p}_k\), is determined because:
\(\sum_{j=1}^{k}\hat{p}_j=1\)
That is:
\(\hat{p}_k=1-(\hat{p}_1+\hat{p}_2+ ... + \hat{p}_{k-1})\)
So, we are estimating k−1 parameters, and therefore we have to subtract k−1 from the degrees of freedom. Doing so, we get that
\(Q=\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(y_{ij}- n_i\hat{p}_j)^2}{n_i\hat{p}_j}\)
follows an approximate chi-square distribution with 2(k−1) − (k−1) = k − 1 degrees of freedom. As was to be proved!
Note Section
Our only example on this page has involved \(h = 2\) samples. If there are more than two samples, that is, if \(h > 2\), then the definition of the chi-square statistic is appropriately modified. That is:
\(Q=\sum_{i=1}^{h}\sum_{j=1}^{k}\frac{(y_{ij}- n_i\hat{p}_j)^2}{n_i\hat{p}_j}\)
follows an approximate chi-square distribution with \(h(k−1) − (k−1) = (h−1)(k − 1)\) degrees of freedom.
Let's take a look at another example.
Example 17-3 Section
The head of a surgery department at a university medical center was concerned that surgical residents in training prescribed unnecessary blood transfusions at a different rate than the more experienced attending physicians. Therefore, he ordered a study of the 49 Attending Physicians and 71 Residents in Training with privileges at the hospital. For each of the 120 surgeons, the number of blood transfusions prescribed unnecessarily in a one-year period was recorded. Based on the number recorded, a surgeon was identified as prescribing unnecessary blood transfusions Frequently, Occasionally, Rarely, or Never. Here's a summary table (or "contingency table") of the resulting data:
Physician | Frequent | Occasionally | Rarely | Never | Total |
---|---|---|---|---|---|
Attending | 2 (4.1%) | 3 (6.1%) | 31 (63.3%) | 13 (26.5%) | 49 |
Resident | 15 (21.1%) | 28 (39.4%) | 23 (32.4%) | 5 (7.0%) | 71 |
Total | 17 | 31 | 54 | 18 | 120 |
Are attending physicians and residents in training distributed equally among the various unnecessary blood transfusion categories?
Answer
We are interested in testing the null hypothesis:
\(H_0 : p_{RF} =p_{AF} \text{ and } p_{RO} =p_{AO} \text{ and } p_{RR} =p_{AR} \text{ and } p_{RN} =p_{AN}\)
against the alternative hypothesis:
\(H_A : p_{RF} \ne p_{AF} \text{ or } p_{RO} \ne p_{AO} \text{ or } p_{RR} \ne p_{AR} \text{ or } p_{RN} \ne p_{AN}\)
The observed data were given to us in the table above. So, the next thing we need to do is find the expected counts for each cell of the table:
Physician | Frequent | Occasionally | Rarely | Never | Total |
---|---|---|---|---|---|
Attending | | | | | 49 |
Resident | | | | | 71 |
Total | 17 | 31 | 54 | 18 | 120 |
It is in the calculation of the expected values that you can readily see why we have (2−1)(4−1) = 3 degrees of freedom in this case. That's because we only have to calculate three of the cells directly:
Physician | Frequent | Occasionally | Rarely | Never | Total |
---|---|---|---|---|---|
Attending | 6.942 | 12.658 | 22.05 | | 49 |
Resident | | | | | 71 |
Total | 17 | 31 | 54 | 18 | 120 |
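In case it isn't obvious where those three values come from, each expected count is just the row total multiplied by the pooled column proportion, that is, \(n_i\hat{p}_j\). For the Attending, Frequent cell, for example:

\(49 \times \hat{p}_1 = 49 \times \dfrac{17}{120} = 6.942\)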
Once we do that, the remaining five cells can be calculated by way of subtraction:
Physician | Frequent | Occasionally | Rarely | Never | Total |
---|---|---|---|---|---|
Attending | 6.942 | 12.658 | 22.05 | 7.35 | 49 |
Resident | 10.058 | 18.342 | 31.95 | 10.65 | 71 |
Total | 17 | 31 | 54 | 18 | 120 |
Now that we have the observed and expected counts, calculating the chi-square statistic is a straightforward exercise:
\(Q=\frac{(2-6.942)^2}{6.942}+ ... +\frac{(5-10.65)^2}{10.65} =31.88 \)
The chi-square test tells us to reject the null hypothesis, at the 0.05 level, if Q is greater than the upper 0.05 critical value of a chi-square distribution with 3 degrees of freedom, that is, if \(Q \ge \chi_{0.05, 3}^{2} = 7.815\). Because \(Q = 31.88 > 7.815\), we reject the null hypothesis. There is sufficient evidence at the 0.05 level to conclude that the distribution of unnecessary blood transfusions differs between attending physicians and residents in training.
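If you'd rather let software do the arithmetic, SciPy's scipy.stats.chi2_contingency computes the Pearson chi-square statistic, p-value, degrees of freedom, and expected counts directly from the observed table. A minimal sketch:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts (rows: Attending, Resident; columns: Freq, Occ, Rare, Never)
observed = np.array([[ 2,  3, 31, 13],
                     [15, 28, 23,  5]])

Q, p_value, df, expected = chi2_contingency(observed)

print(round(Q, 2), df)   # about 31.88 with 3 degrees of freedom
print(p_value < 0.05)    # True: reject the null hypothesis of homogeneity
print(expected)          # matches the table of expected counts above
```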
Using Minitab Section
If you...
- Enter the data (the interior cells of the frequency table only) into the columns of the worksheet
- Select Stat >> Tables >> Chi-square test
then you'll get typical chi-square test output that looks something like this (each cell shows the observed count, with the expected count in parentheses):

Row | Freq | Occ | Rare | Never | Total |
---|---|---|---|---|---|
1 | 2 (6.94) | 3 (12.66) | 31 (22.05) | 13 (7.35) | 49 |
2 | 15 (10.06) | 28 (18.34) | 23 (31.95) | 5 (10.65) | 71 |
Total | 17 | 31 | 54 | 18 | 120 |

Chi-Sq = 3.518 + 7.369 + 3.633 + 4.343 + 2.428 + 5.086 + 2.507 + 2.997 = 31.881

DF = 3, P-Value = 0.000