17.1 - Test For Homogeneity

As suggested in the introduction to this lesson, the test for homogeneity is a method, based on the chi-square statistic, for testing whether two or more multinomial distributions are equal. Let's start by trying to get a feel for how our data might "look" if we have two equal multinomial distributions.

Example 17-1 Section

A university admissions officer was concerned that males and females were accepted at different rates into the four different schools (business, engineering, liberal arts, and science) at her university. She collected the following data on the acceptance of 1200 males and 800 females who applied to the university:

#(Acceptances)   Business    Engineering   Liberal Arts   Science     Total (fixed)
Male             300 (25%)   240 (20%)     300 (25%)      360 (30%)   1200
Female           200 (25%)   160 (20%)     200 (25%)      240 (30%)   800
Total            500 (25%)   400 (20%)     500 (25%)      600 (30%)   2000

Are males and females distributed equally among the various schools?

Answer

Let's start by focusing on the business school. We can see that, of the 1200 males who applied to the university, 300 (or 25%) were accepted into the business school. Of the 800 females who applied to the university, 200 (or 25%) were accepted into the business school. So, the business school looks to be in good shape, as an equal percentage of males and females, namely 25%, were accepted into it.

Now, for the engineering school. We can see that, of the 1200 males who applied to the university, 240 (or 20%) were accepted into the engineering school. Of the 800 females who applied to the university, 160 (or 20%) were accepted into the engineering school. So, the engineering school also looks to be in good shape, as an equal percentage of males and females, namely 20%, were accepted into it.

We probably don't have to drag this out any further. If we look at each column in the table, we see that the proportion of males accepted into each school is the same as the proportion of females accepted into each school... which therefore happens to equal the proportion of students accepted into each school, regardless of gender. Therefore, we can conclude that males and females are distributed equally among the four schools.

Example 17-2 Section

A university admissions officer was concerned that males and females were accepted at different rates into the four different schools (business, engineering, liberal arts, and science) at her university. She collected the following data on the acceptance of 1200 males and 800 females who applied to the university:

#(Acceptances)   Business    Engineering   Liberal Arts   Science     Total (fixed)
Male             240 (20%)   480 (40%)     120 (10%)      360 (30%)   1200
Female           240 (30%)   80 (10%)      320 (40%)      160 (20%)   800
Total            480 (24%)   560 (28%)     440 (22%)      520 (26%)   2000

Are males and females distributed equally among the various schools?

Answer

Let's again start by focusing on the business school. In this case, of the 1200 males who applied to the university, 240 (or 20%) were accepted into the business school. And, of the 800 females who applied to the university, 240 (or 30%) were accepted into the business school. So, the business school appears to have different rates of acceptance for males and females, 20% compared to 30%.

Now, for the engineering school. We can see that, of the 1200 males who applied to the university, 480 (or 40%) were accepted into the engineering school. Of the 800 females who applied to the university, only 80 (or 10%) were accepted into the engineering school. So, the engineering school also appears to have different rates of acceptance for males and females, 40% compared to 10%.

Again, there's no need to drag this out any further. If we look at each column in the table, we see that the proportion of males accepted into each school differs from the proportion of females accepted into each school... and neither matches the overall proportion of students accepted into each school, regardless of gender. Therefore, we can conclude that males and females are not distributed equally among the four schools.
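
As a quick sanity check, here is a minimal Python sketch (the variable names are mine; the counts come from the two tables above) that converts each row to within-row proportions and compares the male and female rows:

# Acceptance counts by school: Business, Engineering, Liberal Arts, Science
tables = {
    "Example 17-1": {"Male": [300, 240, 300, 360], "Female": [200, 160, 200, 240]},
    "Example 17-2": {"Male": [240, 480, 120, 360], "Female": [240, 80, 320, 160]},
}

for name, table in tables.items():
    # Within-row proportions: divide each count by that row's total
    props = {sex: [y / sum(row) for y in row] for sex, row in table.items()}
    print(name)
    for sex, p in props.items():
        print(f"  {sex:6s}", [round(x, 2) for x in p])
    print("  Rows equal?", props["Male"] == props["Female"])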

In the context of the two examples above, it quickly becomes apparent that if we wanted to formally test the hypothesis that males and females are distributed equally among the four schools, we'd want to test the hypotheses:

\(H_0 : p_{MB} =p_{FB} \text{ and } p_{ME} =p_{FE} \text{ and } p_{ML} =p_{FL} \text{ and } p_{MS} =p_{FS}\)
\(H_A : p_{MB} \ne p_{FB} \text{ or } p_{ME} \ne p_{FE} \text{ or } p_{ML} \ne p_{FL} \text{ or } p_{MS} \ne p_{FS}\)

where:

  1. \(p_{Mj}\) is the proportion of males accepted into school j = B, E, L, or S
  2. \(p_{Fj}\) is the proportion of females accepted into school j = B, E, L, or S

In conducting such a hypothesis test, we're comparing the proportions of two multinomial distributions. Before we can develop the method for conducting such a hypothesis test, that is, for comparing the proportions of two multinomial distributions, we first need to define some notation.

Notation Section

We'll use what I think most statisticians would consider standard notation, namely that:

  1. The letter i will index the h row categories, and
  2. The letter j will index the k column categories

(The text reverses the use of the i index and the j index.) That said, let's use the framework of the previous examples to introduce the notation we'll use. That is, rewrite the tables above using the following generic notation:

#(Acceptances) Business \((j = 1)\) Engineering \((j = 2)\) Liberal Arts \((j = 3)\) Science \((j = 4)\) Total (fixed)
Male \((i = 1)\) \(y_{11} \left(\hat{p}_{11} \right)\) \(y_{12} \left(\hat{p}_{12} \right)\) \(y_{13} \left(\hat{p}_{13} \right)\) \(y_{14} \left(\hat{p}_{14} \right)\) \(n_{1}=\sum\limits_{j=1}^{k} y_{1j}\)
Female \((i = 2)\) \(y_{21} \left(\hat{p}_{21} \right)\) \(y_{22} \left(\hat{p}_{22} \right)\) \(y_{23} \left(\hat{p}_{23} \right)\) \(y_{24} \left(\hat{p}_{24} \right)\) \(n_{2}=\sum\limits_{j=1}^{k} y_{2j}\)
Total \(y_{11} + y_{21} \left(\hat{p}_1 \right)\) \(y_{12} + y_{22} \left(\hat{p}_2 \right)\) \(y_{13} + y_{23} \left(\hat{p}_3 \right)\) \(y_{14} + y_{24} \left(\hat{p}_4 \right)\) \(n_1 + n_2\)

with:

  1. \(y_{ij}\) denoting the number falling into the \(j^{th}\) category of the \(i^{th}\) sample
  2. \(\hat{p}_{ij}=y_{ij}/n_i\) denoting the proportion in the \(i^{th}\) sample falling into the \(j^{th}\) category
  3. \(n_i=\sum_{j=1}^{k}y_{ij}\) denoting the total number in the \(i^{th}\) sample
  4. \(\hat{p}_{j}=(y_{1j}+y_{2j})/(n_1+n_2)\) denoting the (overall) proportion falling into the \(j^{th}\) category
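
To make the notation concrete, here is a minimal Python sketch (the array names are mine) that computes each of these quantities from the Example 17-1 counts:

import numpy as np

# y[i-1, j-1] holds y_ij: row 0 is the male sample, row 1 is the female sample;
# columns are Business, Engineering, Liberal Arts, Science (Example 17-1 counts)
y = np.array([[300, 240, 300, 360],
              [200, 160, 200, 240]])

n = y.sum(axis=1)                  # n_i: total number in the i-th sample
p_hat_ij = y / n[:, None]          # p-hat_ij = y_ij / n_i
p_hat_j = y.sum(axis=0) / n.sum()  # p-hat_j = (y_1j + y_2j) / (n_1 + n_2)

print("n_i:      ", n)             # [1200  800]
print("p-hat_ij:\n", p_hat_ij)     # each row: 0.25, 0.20, 0.25, 0.30
print("p-hat_j:  ", p_hat_j)       # [0.25 0.2  0.25 0.3 ]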

With the notation defined as such, we are now ready to formulate the chi-square test statistic for testing the equality of two multinomial distributions.

The Chi-Square Test Statistic Section

Theorem

The chi-square test statistic for testing the equality of two multinomial distributions:

\(Q=\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(y_{ij}- n_i\hat{p}_j)^2}{n_i\hat{p}_j}\)

follows an approximate chi-square distribution with k−1 degrees of freedom. Reject the null hypothesis of equal proportions if Q is large, that is, if:

\(Q \ge \chi_{\alpha, k-1}^{2}\)
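
If a chi-square table isn't handy, the critical value can be pulled from software. A minimal sketch using scipy (here assuming \(\alpha = 0.05\) and \(k = 4\) columns, as in the admissions examples):

from scipy.stats import chi2

alpha, k = 0.05, 4
critical_value = chi2.ppf(1 - alpha, df=k - 1)  # upper-alpha point of chi-square(k - 1)
print(round(critical_value, 3))                 # 7.815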

Proof

For the sake of concreteness, let's again use the framework of our example above to derive the chi-square test statistic. For one of the samples, say for the males, we know that:

\(\sum_{j=1}^{k}\frac{(\text{observed }-\text{ expected})^2}{\text{expected}}=\sum_{j=1}^{k}\frac{(y_{1j}- n_1p_{1j})^2}{n_1p_{1j}} \)

follows an approximate chi-square distribution with k−1 degrees of freedom. For the other sample, that is, for the females, we know that:

\(\sum_{j=1}^{k}\frac{(\text{observed }-\text{ expected})^2}{\text{expected}}=\sum_{j=1}^{k}\frac{(y_{2j}- n_2p_{2j})^2}{n_2p_{2j}} \)

follows an approximate chi-square distribution with k−1 degrees of freedom. Therefore, by the independence of the two samples, we can "add up the chi-squares," that is:

\(\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(y_{ij}- n_ip_{ij})^2}{n_ip_{ij}}\)

follows an approximate chi-square distribution with (k−1) + (k−1) = 2(k−1) degrees of freedom.

Oops... but we have a problem! The \(p_{ij}\)'s are unknown to us. Of course, we know by now that the solution is to estimate the \(p_{ij}\)'s. Just how do we do that? Well, if the null hypothesis is true, that is, if the proportions are equal:

\(p_{11}=p_{21}, \quad p_{12}=p_{22}, \quad \ldots, \quad p_{1k}=p_{2k}\)

then we would be best served by pooling the data across the two samples. That is, the best estimate for each \(j^{th}\) category is the pooled estimate:

\(\hat{p}_j=\frac{y_{1j}+y_{2j}}{n_1+n_2}\)
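
For instance, for the business school (j = 1) in Example 17-1, the pooled estimate works out to:

\(\hat{p}_1=\dfrac{y_{11}+y_{21}}{n_1+n_2}=\dfrac{300+200}{1200+800}=\dfrac{500}{2000}=0.25\)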

We also know by now that because we are estimating some parameters, we have to adjust the degrees of freedom. The pooled estimates \(\hat{p}_j\) estimate the true unknown proportions \(p_{1j} = p_{2j} = p_j\). Now, if we know the first k−1 estimates, that is, if we know:

\(\hat{p}_1, \hat{p}_2, ... , \hat{p}_{k-1}\)

then the \(k^{th}\) one, that is \(\hat{p}_k\), is determined because:

\(\sum_{j=1}^{k}\hat{p}_j=1\)

That is:

\(\hat{p}_k=1-(\hat{p}_1+\hat{p}_2+ ... + \hat{p}_{k-1})\)

So, we are estimating k−1 parameters, and therefore we have to subtract k−1 from the degrees of freedom. Doing so, we get that

\(Q=\sum_{i=1}^{2}\sum_{j=1}^{k}\frac{(y_{ij}- n_i\hat{p}_j)^2}{n_i\hat{p}_j}\)

follows an approximate chi-square distribution with 2(k−1) − (k−1) = k − 1 degrees of freedom. As was to be proved!

Note Section

The examples on this page have so far involved \(h = 2\) samples. If there are more than two samples, that is, if \(h > 2\), then the definition of the chi-square statistic is appropriately modified. That is:

\(Q=\sum_{i=1}^{h}\sum_{j=1}^{k}\frac{(y_{ij}- n_i\hat{p}_j)^2}{n_i\hat{p}_j}\)

follows an approximate chi-square distribution with \(h(k−1) − (k−1) = (h−1)(k − 1)\) degrees of freedom.
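
Here is a minimal Python sketch of this general statistic (the function name is mine, not a standard library routine); it takes an h × k table of counts and returns Q along with its degrees of freedom:

import numpy as np

def homogeneity_q(counts):
    """Chi-square statistic Q and degrees of freedom for an h x k table of counts."""
    y = np.asarray(counts, dtype=float)
    n = y.sum(axis=1, keepdims=True)     # n_i: the fixed row (sample) totals
    p_hat = y.sum(axis=0) / y.sum()      # pooled column proportions p-hat_j
    expected = n * p_hat                 # expected counts n_i * p-hat_j
    q = ((y - expected) ** 2 / expected).sum()
    h, k = y.shape
    return q, (h - 1) * (k - 1)

# Example 17-2 counts: rows are the male and female samples, columns the four schools
q, df = homogeneity_q([[240, 480, 120, 360],
                       [240,  80, 320, 160]])
print(round(q, 2), df)   # roughly 389.11 with 3 degrees of freedom (well beyond 7.815)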

Let's take a look at another example.

Example 17-3 Section

surgery

The head of a surgery department at a university medical center was concerned that surgical residents in training prescribed unnecessary blood transfusions at a different rate than the more experienced attending physicians. Therefore, he ordered a study of the 49 Attending Physicians and 71 Residents in Training with privileges at the hospital. For each of the 120 surgeons, the number of blood transfusions prescribed unnecessarily in a one-year period was recorded. Based on the number recorded, a surgeon was identified as either prescribing unnecessary blood transfusions Frequently, Occasionally, Rarely, or Never. Here's a summary table (or "contingency table") of the resulting data:

Physician   Frequently   Occasionally   Rarely       Never        Total
Attending   2 (4.1%)     3 (6.1%)       31 (63.3%)   13 (26.5%)   49
Resident    15 (21.1%)   28 (39.4%)     23 (32.4%)   5 (7.0%)     71
Total       17           31             54           18           120

Are attending physicians and residents in training distributed equally among the various unnecessary blood transfusion categories?

Answer

We are interested in testing the null hypothesis:

\(H_0 : p_{RF} =p_{AF} \text{ and } p_{RO} =p_{AO} \text{ and } p_{RR} =p_{AR} \text{ and } p_{RN} =p_{AN}\)

against the alternative hypothesis:

\(H_A : p_{RF} \ne p_{AF} \text{ or } p_{RO} \ne p_{AO} \text{ or } p_{RR} \ne p_{AR} \text{ or } p_{RN} \ne p_{AN}\)

The observed data were given to us in the table above. So, the next thing we need to do is find the expected counts for each cell of the table:

Physician   Frequently   Occasionally   Rarely   Never   Total
Attending                                                49
Resident                                                 71
Total       17           31             54       18      120

It is in the calculation of the expected values that you can readily see why we have (2−1)(4−1) = 3 degrees of freedom in this case. That's because we only have to calculate three of the cells directly; the remaining cells are then determined by the fixed row and column totals.
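
Each expected count is the (fixed) row total times the pooled column proportion, that is, \(n_i\hat{p}_j\). For the attending physicians, the first three cells are:

\(49\left(\dfrac{17}{120}\right)=6.942, \quad 49\left(\dfrac{31}{120}\right)=12.658, \quad 49\left(\dfrac{54}{120}\right)=22.05\)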

Physician   Frequently   Occasionally   Rarely   Never   Total
Attending   6.942        12.658         22.05            49
Resident                                                 71
Total       17           31             54       18      120

Once we do that, the remaining five cells can be calculated by way of subtraction:

Physician   Frequently   Occasionally   Rarely   Never   Total
Attending   6.942        12.658         22.05    7.35    49
Resident    10.058       18.342         31.95    10.65   71
Total       17           31             54       18      120
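
For instance, the attending physicians' Never cell and the residents' Frequently cell come straight from the fixed totals:

\(49-(6.942+12.658+22.05)=7.35 \quad \text{and} \quad 17-6.942=10.058\)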

Now that we have the observed and expected counts, calculating the chi-square statistic is a straightforward exercise:

\(Q=\frac{(2-6.942)^2}{6.942}+ ... +\frac{(5-10.65)^2}{10.65} =31.88 \)

The chi-square test tells us to reject the null hypothesis, at the 0.05 level, if Q exceeds the upper 0.05 critical value of a chi-square distribution with 3 degrees of freedom, that is, if \(Q > 7.815\). Because \(Q = 31.88 > 7.815\), we reject the null hypothesis. There is sufficient evidence at the 0.05 level to conclude that the distribution of unnecessary transfusions differs between attending physicians and residents.

Using Minitab Section

If you...

  1. Enter the data (the observed counts from the inside of the frequency table only, not the row or column totals) into the columns of the worksheet
  2. Select Stat >> Tables >> Chi-square test

then you'll get typical chi-square test output that looks something like this:

Chi-Square Test: Freq, Occ, Rare, Never

Expected counts are printed below observed counts

           Freq      Occ     Rare    Never    Total
    1         2        3       31       13       49
           6.94    12.66    22.05     7.35

    2        15       28       23        5       71
          10.06    18.34    31.95    10.65

Total        17       31       54       18      120

Chi-Sq =  3.518 +  7.369 +  3.633 +  4.343 +
          2.428 +  5.086 +  2.507 +  2.997 = 31.881
DF = 3, P-Value = 0.000
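
If you prefer Python, the Minitab output can be reproduced (approximately) with scipy's chi-square test for two-way tables; here is a minimal sketch using the observed counts from above:

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts: rows are attending physicians and residents in training,
# columns are Frequently, Occasionally, Rarely, Never
observed = np.array([[ 2,  3, 31, 13],
                     [15, 28, 23,  5]])

q, p_value, df, expected = chi2_contingency(observed, correction=False)
print(round(q, 3), df)      # roughly 31.881 with 3 degrees of freedom
print(p_value)              # a p-value near zero, matching Minitab's 0.000
print(expected.round(3))    # matches the expected counts computed by hand above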