# 11.1.1 - The Sign Test

Suppose we are interested in testing the population median. The hypotheses are similar to those we have seen before but use the median, \(\eta\), instead of the mean.

If the hypotheses are based on the median, they would look like the following:

#### Null

- \(H_0\colon\eta=\eta_0\)

#### Alternative

- \(H_a\colon\eta>\eta_0\)
- \(H_a\colon\eta<\eta_0\)
- \(H_a\colon\eta\ne\eta_0\)

For the IRS example, the null and alternative are:

\(H_0\colon \eta=160\) vs \(H_a\colon \eta >160\)

Consider the test statistic, \(S^+\), where

\(S^+=\text{the number of observations greater than 160}\)

Under the null hypothesis, about half of the observations should fall above 160, so \(S^+\) should be close to \(n/2\). More precisely, \(S^+\) should have a binomial distribution with parameters \(n\) and \(p=0.5\). Let’s review and verify that it is a Binomial random variable.

- **The number of trials, \(n\), is fixed and known.** Here the number of trials equals the number of observations. Therefore, in this case, \(n\) is fixed and known.
- **The outcomes of each trial can be categorized as either a "success" or a "failure", with the probability of success being \(p\).** Observations can either be above the median (a "success") or below the median (a "failure"), with the probability of being above the median being \(p=0.5\).
- **The probability of "success" remains constant from trial to trial.** The probability of being above the median remains the same for each observation.
- **The trials are independent.** Each of the observations is independent of the next, so we are okay here.

Now, back to our problem. To make a conclusion, we need to find the p-value. It is the probability of seeing what we see or something more extreme given the null hypothesis is true.

In the IRS example, let’s find \(S^+\), or in other words, let's find the number of observations that fall above 160. We find \(S^+=15\).
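Counting the sign statistic is a simple tally. The sketch below is illustrative only: the function name `sign_counts` and the sample times are hypothetical and are not the values in irs.txt. It also counts observations exactly equal to the hypothesized median, which contribute to neither side.

```python
def sign_counts(observations, eta0):
    """Count observations below, equal to, and above the hypothesized median."""
    below = sum(1 for x in observations if x < eta0)
    equal = sum(1 for x in observations if x == eta0)
    above = sum(1 for x in observations if x > eta0)
    return below, equal, above


# Hypothetical completion times in minutes (NOT the irs.txt data).
times = [95, 120, 158, 160, 161, 175, 190, 240]

below, equal, above = sign_counts(times, 160)
s_plus = above  # the sign statistic S+
print(below, equal, above)
```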

Using the Binomial distribution function, we can find the p-value as \(P(S^+\ge 15)\):

\begin{align} P(S^+\ge 15)&=\sum_{i=15}^{30} {30\choose i}(0.5)^{i}(0.5)^{30-i}\\&=\sum_{i=15}^{30} {30\choose i} (0.5)^{30}\\&\approx 0.5722322 \end{align}
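The sum can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are ours. It also uses the symmetry of the Binomial(30, 0.5) distribution about 15 as a cross-check: the upper tail starting at 15 equals one half plus half the probability of exactly 15.

```python
from math import comb

n = 30        # number of observations
s_plus = 15   # observed sign statistic

# Upper-tail probability P(S+ >= 15) under Binomial(n, 0.5).
p_value = sum(comb(n, i) for i in range(s_plus, n + 1)) / 2**n

# Cross-check by symmetry: P(S+ >= 15) = 0.5 + 0.5 * P(S+ = 15) when n = 30.
shortcut = 0.5 + 0.5 * comb(n, 15) / 2**n

print(round(p_value, 7), round(shortcut, 7))
```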

If we assume the significance level is 5%, then the p-value of about 0.572 is greater than 0.05. We would fail to reject the null hypothesis and conclude that there is not enough evidence in the data to suggest that the median is above 160 minutes.

This test is called the **Sign Test** and \(S^+\) is called the **sign statistic**. The Sign Test is also known as the Binomial Test.
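The same binomial calculation covers all three alternative hypotheses listed earlier. The sketch below is one possible implementation, not Minitab's: the function names are hypothetical, and the two-sided p-value uses the common convention of doubling the smaller tail and capping the result at 1, which is an assumption on our part. Here `n` is taken to be the number of observations that differ from \(\eta_0\).

```python
from math import comb


def binom_half_cdf(k, n):
    """P(X <= k) for X ~ Binomial(n, 0.5)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) for i in range(min(k, n) + 1)) / 2**n


def sign_test_p_value(s_plus, n, alternative="greater"):
    """Exact sign-test p-value for the given alternative hypothesis."""
    upper = 1.0 - binom_half_cdf(s_plus - 1, n)  # P(S+ >= s_plus)
    lower = binom_half_cdf(s_plus, n)            # P(S+ <= s_plus)
    if alternative == "greater":
        return upper
    if alternative == "less":
        return lower
    if alternative == "two-sided":
        # Assumed convention: double the smaller tail, capped at 1.
        return min(1.0, 2.0 * min(upper, lower))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


print(sign_test_p_value(15, 30, "greater"))
```

For the IRS example, `sign_test_p_value(15, 30, "greater")` reproduces the upper-tail probability computed above.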

Let's recap what we found. The research question was to see if it took longer than 160 minutes to complete the 1040 form. The measurement was the time in minutes to complete the form. Here is a summary:

Assumption | Test Statistic | p-value | Conclusion |
---|---|---|---|
Data are Normally distributed | t statistic \((t^*)\) | 0.04991 | Reject \(H_0\) |
Quantitative Data | Sign statistic \((S^+)\) | 0.572 | Fail to reject \(H_0\) |

Here, the two tests lead to opposite conclusions. Given the shape of the data, which do you think is the valid conclusion?

## Minitab Sign Test

We can use Minitab to conduct the Sign test.

- Click `Stat` > `Nonparametrics` > `1-Sample sign`.
- Enter your 'variable', 'significance level', and adjust for the alternative.
- Click `OK`.

## Example 11-2: Tax Forms (Sign Test)

Conduct the test for the median time for filling out the tax forms using the Sign Test in Minitab. Download the dataset: [irs.txt]

Conducting the test in Minitab yields the following output.

**1-Sample Sign Test**

Method

\(\eta\): median of time

**Descriptive Statistics**

N | Median |
---|---|
30 | 164 |

**95% Lower bound for \(\eta\)**

Lower Bound for \(\eta\) | Achieved Confidence | Position |
---|---|---|
120.000 | 89.98% | 12 |
116.085 | 95.00% | Interpolation |
116.000 | 95.06% | 11 |

**Test**

Null hypothesis: \(H_0\colon \eta = 160\)

Alternative hypothesis: \(H_1\colon \eta > 160\)

Number < 160 | Number = 160 | Number > 160 | P-Value |
---|---|---|---|
15 | 0 | 15 | 0.5722 |

As you can see in the Minitab output, you can also find a confidence bound (or interval) for the population median based on the sign statistic. As you can imagine, finding it by hand is a bit tricky. It is interpreted in the same way as a confidence interval for the population mean, with the median in place of the mean.
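The "Achieved Confidence" column can be reproduced from the binomial distribution. The \(d\)-th smallest observation lies below the true median exactly when at least \(d\) observations fall below the median, and for continuous data the number of observations below the median is Binomial(\(n\), 0.5). The sketch below follows that reasoning. The function name is hypothetical, and it does not attempt Minitab's interpolated 95.00% row.

```python
from math import comb


def lower_bound_confidence(n, position):
    """Confidence that the position-th smallest observation is below the median.

    Equals P(B >= position) for B ~ Binomial(n, 0.5).
    """
    # P(B <= position - 1): fewer than `position` observations below the median.
    too_few = sum(comb(n, i) for i in range(position)) / 2**n
    return 1.0 - too_few


for d in (12, 11):
    print(d, round(100 * lower_bound_confidence(30, d), 2))
```

With \(n=30\), positions 12 and 11 give roughly the 89.98% and 95.06% shown in the output.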

We present the details of the Sign Test because it can be derived from the material we have covered so far in the course. In the next section, we present another test and how to run it in Minitab, but leave out the details.