Exploratory Analysis - 1

Test the Equality of Two Proportions

The SAS program water.sas provides the following frequency table (and others) of the water level study data:

[SAS output: frequency table from water.sas]

Why was the passing rate so low? What factors affect passing?

In the past, statisticians often used ordinary regression even when the response in an experiment was categorical. It would be interesting to see how poorly an ordinary regression analysis performs on these data compared with logistic regression.

First, we could run a Pearson chi-square test of the equality of two proportions. The null hypothesis at this stage is that the proportion of males who passed is the same as the proportion of females who passed. As the frequency table above reports, the observed percentage of females who passed is 29.91% and the observed percentage of males who passed is 64.41%.

The Pearson chi-square test of the equality of the two proportions gives a chi-square value of 18.562 with a p-value reported as 0.000, that is, p < 0.001.

[SAS output: Pearson chi-square test results]

This result is highly significant (the p-value is well below 0.05), so we reject the hypothesis that the proportions passing are the same for females and males.
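To make the mechanics of the test concrete, here is a minimal Python sketch of a Pearson chi-square test for the equality of two proportions in a 2x2 table. It is not the water.sas program. The function name `pearson_chi_square_2x2` and its parameters are hypothetical, and the counts in the usage lines are made-up illustrative values, not the water level study data, so the result will not match the 18.562 reported above. The sketch assumes no continuity correction and that every expected cell count is positive.

```python
import math


def pearson_chi_square_2x2(pass_a, n_a, pass_b, n_b):
    """Pearson chi-square test (no continuity correction) of the hypothesis
    that two groups have the same passing proportion.

    pass_a, pass_b: number who passed in each group
    n_a, n_b:       group sizes
    Returns (chi2, p_value). Assumes all expected counts are positive.
    """
    # Observed 2x2 table: rows are groups, columns are pass / fail.
    observed = [
        [pass_a, n_a - pass_a],
        [pass_b, n_b - pass_b],
    ]
    row_totals = [n_a, n_b]
    col_totals = [observed[0][0] + observed[1][0],
                  observed[0][1] + observed[1][1]]
    total = n_a + n_b

    # Sum (observed - expected)^2 / expected over the four cells, where
    # expected = row total * column total / grand total under the null.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # A 2x2 table has 1 degree of freedom, and for a chi-square variable
    # with 1 df the upper tail is P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for illustration only (NOT the study data):
# 30 of 100 pass in group A, 60 of 100 pass in group B.
chi2, p_value = pearson_chi_square_2x2(pass_a=30, n_a=100, pass_b=60, n_b=100)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.6f}")
```

With the actual cell counts from the frequency table substituted for the illustrative ones, the same calculation should correspond to the Pearson chi-square line in the SAS output, provided SAS is reporting the uncorrected statistic.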