11.1 - Inference for the Population Median

Introduction

So far, the methods we have learned were for the population mean. The mean is a good measure of center when the data are bell-shaped, but it is sensitive to outliers and extreme values. When the data are skewed, the median is a better measure of center because, as you may recall, it is a resistant measure. The example below demonstrates why we might consider an alternative to the methods presented so far. In other words, we may want to test the median rather than the mean.

Example 11-1: Tax Forms

The Internal Revenue Service (IRS) claims that it should typically take about 160 minutes to fill out a 1040 tax form. A researcher believes that the IRS's claim is not correct and that it generally takes people longer to complete the form. He recorded the time (in minutes) it took 30 individuals to complete the form. Download the data set: [irs.txt]

How would we approach this using previous methods? We would set the hypotheses as:

\(H_0\colon \mu=160\)

\( H_a\colon \mu>160\)

If we run the analysis in Minitab, we get the following output:

Descriptive Statistics

N	Mean	StDev	SE Mean	95% Lower Bound for μ
30	209.40	159.15	29.06	160.03

μ: mean of time

Test

Null hypothesis	H₀: μ = 160
Alternative hypothesis	H₁: μ > 160

T-Value	P-Value
1.70	0.0499

The output gives the \(t\) statistic (1.70, or 1.7001 unrounded) and the p-value (0.0499, or 0.04991 unrounded); the test has \(n-1=29\) degrees of freedom. The p-value is less than our significance level, \(\alpha=0.05\). Therefore, we reject the null hypothesis and conclude that, on average, it takes longer than 160 minutes to complete the 1040 form.
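As a quick check on the Minitab output, the \(t\) statistic can be recomputed from the summary statistics alone. The sketch below uses only the values in the Descriptive Statistics table; the variable names are illustrative and are not part of the original lesson.

```python
import math

# Summary statistics copied from the Minitab output above
n = 30
sample_mean = 209.40
sample_sd = 159.15
mu_0 = 160  # hypothesized mean under H0

# Standard error of the mean and the one-sample t statistic
se_mean = sample_sd / math.sqrt(n)
t_stat = (sample_mean - mu_0) / se_mean

print(f"SE Mean = {se_mean:.2f}")
print(f"t = {t_stat:.2f} on {n - 1} degrees of freedom")
```

Both values should agree with the SE Mean and T-Value columns of the output after rounding.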

We assumed that the time to fill out the form is Normally distributed (or at least symmetric), but completion time is generally skewed with a long right tail. Let’s take a look at the data. Below is the histogram of the data.

Histogram showing the time in minutes of completing the IRS tax forms. The highest bar has a frequency of 10 at 100 minutes.

As you can see from the histogram, the shape of the data does not support the assumption that time is Normally distributed. There is a clear right tail here. In a skewed distribution, the population median, typically denoted as \(\eta\), is a better typical value than the population mean, \(\mu\).

11.1.1 - The Sign Test

Suppose we are interested in testing the population median. The hypotheses are similar to those we have seen before but use the median, \(\eta\), instead of the mean.

If the hypotheses are based on the median, they would look like the following:

Null

\(H_0\colon\eta=\eta_0\)

Alternative

\(H_a\colon\eta>\eta_0\)

\(H_a\colon\eta<\eta_0\)

\(H_a\colon\eta\ne\eta_0\)

For the IRS example, the null and alternative are:

\(H_0\colon \eta=160\) vs \(H_a\colon \eta >160\)

Consider the test statistic, \(S^+\), where

\(S^+=\text{the number of observations greater than 160}\)

Under the null hypothesis, each observation has a 50% chance of falling above 160, so \(S^+\) should be about half the number of observations. Therefore, \(S^+\) should have a binomial distribution with parameters \(n\) and \(p=0.5\). Let’s review and verify that it is a Binomial random variable.

The number of trials, \(n\), is fixed and known. Here the number of trials equals the number of observations. Therefore, in this case, \(n\) is fixed and known.
The outcomes of each trial can be categorized as either a "success" or a "failure", with the probability of success being \(p\). Observations can either be above the median (a "success") or below the median (a "failure") with the probability of being above the median being \(p=0.5\).
The probability of "success" remains constant from trial to trial. The probability of being above the median remains the same for each observation.
The trials are independent. Each of the observations is independent of the next, so we are okay here.

Now, back to our problem. To make a conclusion, we need to find the p-value. It is the probability of seeing what we see or something more extreme given the null hypothesis is true.

In the IRS example, let’s find \(S^+\), or in other words, let's find the number of observations that fall above 160. We find \(S^+=15\).

Using the Binomial distribution function, we can find the p-value as \(P(S^+\ge 15)\):

\begin{align} P(S^+\ge15)&=\sum_{i=15}^{30} {30\choose i}(0.5)^{30-i}(0.5)^i\\&=\sum_{i=15}^{30} {30\choose i} (0.5)^{30}\\&\approx 0.5722322 \end{align}

At a significance level of 5%, the p-value (0.572) is greater than 0.05. We fail to reject the null hypothesis and conclude that there is no evidence in the data to suggest that the median is above 160 minutes.
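The binomial tail probability can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The function names are hypothetical, and because the raw data in irs.txt are not reproduced here, the p-value is computed from the counts (\(S^+=15\), \(n=30\)) rather than from the observations themselves.

```python
from math import comb

def sign_statistic(data, eta_0):
    """Count the observations strictly greater than the hypothesized median."""
    return sum(1 for x in data if x > eta_0)

def upper_tail_p_value(s_plus, n):
    """P(S+ >= s_plus) when S+ ~ Binomial(n, 0.5)."""
    # Each term is C(n, i) * 0.5**n, so the common factor is pulled out
    return sum(comb(n, i) for i in range(s_plus, n + 1)) / 2 ** n

# IRS example: 15 of the 30 observations exceed 160 minutes
p_value = upper_tail_p_value(15, 30)
print(round(p_value, 7))
```

Given the full data set, `sign_statistic(times, 160)` would supply the first argument. Observations exactly equal to the hypothesized median are usually dropped before counting, which does not matter here because none equal 160.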

This test is called the Sign Test and \(S^+\) is called the sign statistic. The Sign Test is also known as the Binomial Test.

Let's recap what we found. The research question was to see if it took longer than 160 minutes to complete the 1040 form. The measurement was the time in minutes to complete the form. Here is a summary:

Assumption	Test Statistic	p-value	Conclusion
Data are Normally distributed	t statistic (\(t^*\))	0.04991	Reject \(H_0\)
Quantitative Data	Sign statistic \((S^+)\)	0.572	Fail to reject \(H_0\)

Here, we have two opposite conclusions from each of the tests. Given the shape of the data, which do you think is the valid conclusion?

Minitab®

Minitab Sign Test

We can use Minitab to conduct the Sign test.

Click Stat > Nonparametrics > 1-Sample sign
Enter your 'variable', 'significance level', and adjust for the alternative.
Click OK.

Example 11-2: Tax Forms (Sign Test)

Conduct the test for the median time for filling out the tax forms using the Sign Test in Minitab. Download the dataset: [irs.txt]

Answer

Conducting the test in Minitab yields the following output.

1-Sample Sign Test

Method

\(\eta\): median of time

Descriptive Statistics

N	Median
30	164

95% Lower bound for \(\eta\)

Lower Bound for \(\eta\)	Achieved Confidence	Position
120.000	89.98%	12
116.085	95.00%	Interpolation
116.000	95.06%	11

Test

Null hypothesis	H₀: \(\eta\) = 160
Alternative hypothesis	H₁: \(\eta\) > 160

Number < 160	Number = 160	Number > 160	P-Value
15	0	15	0.5722

You can see the p-value and \(S^+\), which is the number greater than 160.

As you can see in the Minitab output, you can also find a confidence interval for the population median based on the sign statistic. As you can imagine, finding the confidence interval by hand is a bit tricky. The interpretation of the confidence interval for the median has the same template interpretation as the confidence interval for the population mean.
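The "Achieved Confidence" column in the output can be reproduced from the same Binomial(\(n\), 0.5) distribution used for the test. The sketch below assumes that "Position" refers to the rank of the observation (counting from the smallest) that serves as the lower bound; the function name is hypothetical.

```python
from math import comb

def achieved_confidence(n, position):
    """Confidence that the median exceeds the position-th smallest observation.

    The bound fails only if fewer than `position` observations fall below
    the median, i.e. if a Binomial(n, 0.5) count is at most position - 1.
    """
    miss = sum(comb(n, i) for i in range(position)) / 2 ** n
    return 1 - miss

for position in (12, 11):
    print(position, f"{achieved_confidence(30, position):.2%}")
```

Because the binomial distribution is discrete, no single observation gives exactly 95% confidence, which is why the output also reports an interpolated bound between positions 11 and 12.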

We present the details of the Sign Test because it can be found based on the material we covered so far in the course. For the next section, we present another test and how to do it in Minitab but leave out the details.

11.1.2 - One-Sample Wilcoxon

In this section, we briefly present the one-sample Wilcoxon test. This test was developed by Frank Wilcoxon in 1945. It is considered one of the first “nonparametric” tests developed.

The hypotheses are similar to the ones presented previously for the Sign Test:

Null

\(H_0\colon \eta=\eta_0\)

Alternative

\(H_a\colon \eta>\eta_0\)

\(H_a\colon \eta<\eta_0\)

\(H_a\colon \eta\ne\eta_0\)

The Wilcoxon test needs additional assumptions, however. They are:

The random variable of interest is continuous
The probability distribution of the population is symmetric.

If we compare the assumptions of the Wilcoxon test to the Sign Test, the Wilcoxon test requires the distribution to be symmetric. For example, we should not be making conclusions for the IRS data using the Wilcoxon test because the data is right-skewed.

The test statistic is typically denoted as \(W\). We will not go into details on how this statistic is found as it involves ranks.

Minitab®

One-Sample Wilcoxon Test

Minitab will conduct the one-sample Wilcoxon test.

Choose Stat > Nonparametrics > 1-sample Wilcoxon
Enter the 'variable', the 'hypothesized value', and the correct 'alternative'.
Choose OK.

Example 11-3: Checkout Time (Wilcoxon Test)

Fresh N Friendly food store advertises that its checkout waiting time is four minutes or less. An angry customer wants to dispute this claim. He takes a random sample of shoppers at the peak time and records their checkout times. Can he dispute the claim at the 10% significance level?

Checkout times:

3.8, 5.3, 3.5, 4.5, 7.2, 5.1

Use Minitab to conduct the 1-sample Wilcoxon Test. Compare the conclusion to the one found using the one-sample t-test (see Lesson 6b.4, More Examples).

Answer

The hypotheses are \(H_0\colon\eta=4\) versus \(H_a\colon\eta>4\). The sample size is small here, and it may not even be reasonable to assume that the data are symmetric. We get the following output from Minitab:

Wilcoxon Signed Rank Test: time

Method

\(\eta\): median of time

Descriptive Statistics

Sample	N	Median
time	6	4.8

Test

Null hypothesis	H₀: \(\eta\) = 4
Alternative hypothesis	H₁: \(\eta\) > 4

N for Wilcoxon

Sample	N for Test	Wilcoxon Statistic	P-Value
time	6	17.50	0.086

The p-value for this test is 0.086, which is less than our significance level of 0.10, and therefore we reject the null hypothesis. There is enough evidence in the data to suggest the population median time is greater than 4 minutes.
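Although the lesson leaves out the details of the Wilcoxon statistic, the six checkout times are few enough to rank by hand or in a few lines of Python. The sketch below takes \(W\) to be the sum of the ranks of the absolute differences from 4 for the observations above 4, with tied differences sharing an average rank. The p-value uses a normal approximation with a continuity correction, which is an assumption about the method and not something stated in the lesson. All function names are hypothetical.

```python
from statistics import NormalDist

def signed_rank_statistic(data, eta_0):
    """Sum of the ranks of |x - eta_0| over observations above eta_0."""
    # Round to suppress floating-point noise; drop observations equal to eta_0
    diffs = [round(x - eta_0, 10) for x in data]
    diffs = [d for d in diffs if d != 0]
    ordered = sorted(abs(d) for d in diffs)

    def average_rank(value):
        # Tied absolute differences share the average of their ranks
        ranks = [i + 1 for i, a in enumerate(ordered) if a == value]
        return sum(ranks) / len(ranks)

    w = sum(average_rank(abs(d)) for d in diffs if d > 0)
    return w, len(diffs)

def normal_approx_p_value(w, n):
    """Upper-tail p-value from a normal approximation (assumed method)."""
    mean = n * (n + 1) / 4
    sd = (n * (n + 1) * (2 * n + 1) / 24) ** 0.5
    # Subtract 0.5 as a continuity correction for the upper tail
    return 1 - NormalDist().cdf((w - 0.5 - mean) / sd)

times = [3.8, 5.3, 3.5, 4.5, 7.2, 5.1]
w, n_used = signed_rank_statistic(times, 4)
print(w, round(normal_approx_p_value(w, n_used), 3))
```

The statistic matches the Wilcoxon Statistic of 17.50 in the output, and under these assumptions the approximate p-value comes out close to the reported 0.086.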

If we assume the data are normal and perform a test for the mean, the p-value was 0.0798.
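For comparison, the one-sample \(t\) statistic for the same six checkout times can be computed directly. This sketch uses the standard library only and compares the statistic with the upper 10% critical value for 5 degrees of freedom from a standard \(t\) table instead of computing a p-value.

```python
from statistics import mean, stdev

times = [3.8, 5.3, 3.5, 4.5, 7.2, 5.1]
mu_0 = 4  # hypothesized mean under H0

n = len(times)
t_stat = (mean(times) - mu_0) / (stdev(times) / n ** 0.5)

# Upper 10% critical value of the t distribution with 5 degrees of freedom,
# taken from a standard t table
t_critical = 1.476

print(f"t = {t_stat:.2f}, reject H0 at the 10% level: {t_stat > t_critical}")
```

The statistic exceeds the critical value, which is consistent with a p-value below 0.10.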

At the 10% level, the data suggest that both the mean and the median are greater than 4.

11.1.3 - Other Nonparametric Tests

So far we discussed nonparametric tests for only one parameter. There are many tests for two parameters and for more than two parameters. There are also tests like Fisher’s Exact that will test for the association between two categorical variables.

In the table below, we give some examples of nonparametric tests. If you are interested, explore these tests on your own.

Name of Test	What it is used for
Mann-Whitney	Test for two medians
Wilcoxon Signed-Rank Test	Test for two paired medians
Kruskal-Wallis	Test for more than two treatments
Mood’s Median Test	Test for more than two treatments
Friedman’s Test	Repeated Measures
