# 11.3.2 - Minitab: Test of Independence


## Raw vs Summarized Data

If you have a data file with the responses for individual cases, then you have "raw data" and can follow the directions below. If you have a table of counts, then you have "summarized data." There is an example of conducting a chi-square test of independence using summarized data on a later page. Only the data entry differs; once the data are entered, the procedure is the same for both.
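To make the distinction concrete, the sketch below tallies raw data (one pair of responses per case) into a summarized two-way table of counts. It is an illustration only: the six cases and the helper name `summarize` are hypothetical and are not taken from any course dataset.

```python
from collections import Counter

# Hypothetical raw data: one (seating, ever_cheat) pair per student.
raw = [
    ("Front", "No"), ("Front", "Yes"), ("Back", "No"),
    ("Middle", "No"), ("Middle", "Yes"), ("Back", "No"),
]

def summarize(pairs):
    """Tally raw (row, column) pairs into a two-way table of counts."""
    counts = Counter(pairs)
    row_levels = sorted({r for r, _ in pairs})
    col_levels = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table

rows, cols, table = summarize(raw)
print("Columns:", cols)
for label, row_counts in zip(rows, table):
    print(label, row_counts)
```

The resulting table of counts is what "summarized data" refers to; either form carries the same information for a chi-square test of independence.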

## Minitab® – Chi-square Test Using Raw Data

Research question: Is there a relationship between where a student sits in class and whether they have ever cheated?

• Null hypothesis: Seat location and cheating are not related in the population.
• Alternative hypothesis: Seat location and cheating are related in the population.

To perform a chi-square test of independence in Minitab using raw data:

1. Open Minitab file: class_survey.mpx
2. Select Stat > Tables > Chi-Square Test for Association
3. Select Raw data (categorical variables) from the dropdown.
4. Choose the variable Seating to insert it into the Rows box
5. Choose the variable Ever_Cheat to insert it into the Columns box
6. Click the Statistics button and check the boxes Chi-square test for association and Expected cell counts
7. Click OK and OK

This should result in the following output:

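##### Rows: Seating Columns: Ever_Cheat

Cell contents: count (expected count)

| Seating | No | Yes | All |
|---|---|---|---|
|  | 24 (24.21) | 8 (7.79) | 32 |
|  | 38 (34.81) | 8 (11.19) | 46 |
|  | 109 (111.98) | 39 (36.02) | 148 |
| All | 171 | 55 | 226 |

##### Chi-Square Test

|  | Chi-Square | DF | P-Value |
|---|---|---|---|
| Pearson | 1.539 | 2 | 0.463 |
| Likelihood Ratio | 1.626 | 2 | 0.443 |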

#### Interpret

All expected values are at least 5, so we can use the Pearson chi-square test statistic. Our results are $$\chi^2(2) = 1.539$$, $$p = 0.463$$. Because our $$p$$ value is greater than the standard alpha level of 0.05, we fail to reject the null hypothesis. There is not enough evidence of a relationship in the population between seat location and whether a student has cheated.
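As a cross-check on the Minitab output, the Pearson statistic can be reproduced from the observed counts. The following is a minimal Python sketch and not part of the Minitab workflow. The function name `pearson_chi_square` is hypothetical, and the closed-form p-value $$e^{-\chi^2/2}$$ used here applies only when the degrees of freedom equal 2.

```python
import math

# Observed counts from the Minitab output: one row per Seating category,
# columns are Ever_Cheat = No, Yes.
observed = [[24, 8], [38, 8], [109, 39]]

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

chi2, df, expected = pearson_chi_square(observed)

# Assumption for this shortcut: df == 2, where P(X > x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

The printed values should agree with the Pearson row of the output up to rounding.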

# 11.3.2.1 - Example: Raw Data


## Example: Dog & Cat Ownership

Is there a relationship between dog and cat ownership in the population of all World Campus STAT 200 students? Let's conduct a hypothesis test using the dataset fall2016stdata.mpx.

1. Check any necessary assumptions and write null and alternative hypotheses.

$$H_0:$$ There is not a relationship between dog ownership and cat ownership in the population of all World Campus STAT 200 students.
$$H_a:$$ There is a relationship between dog ownership and cat ownership in the population of all World Campus STAT 200 students.

Assumption: All expected counts are at least 5. The expected counts here are 176.02, 75.98, 189.98, and 82.02, so this assumption has been met.
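Each expected count is the row total times the column total, divided by the overall sample size. The short Python sketch below reproduces the four expected counts from the row and column totals in this example's Minitab output (252 and 272 for Dog, 366 and 158 for Cat, 524 overall). The dictionary labels are illustrative names, not Minitab identifiers.

```python
# Marginal totals from the Minitab cross tabulation for this example.
row_totals = {"Dog: No": 252, "Dog: Yes": 272}
col_totals = {"Cat: No": 366, "Cat: Yes": 158}
n = 524

# Expected count under independence: (row total * column total) / n.
expected = {
    (r, c): rt * ct / n
    for r, rt in row_totals.items()
    for c, ct in col_totals.items()
}

for cell, value in expected.items():
    print(cell, round(value, 2))

# The chi-square approximation is considered appropriate when every
# expected count is at least 5.
assumption_met = all(value >= 5 for value in expected.values())
print("All expected counts >= 5:", assumption_met)
```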

2. Calculate an appropriate test statistic.

Let's use Minitab to calculate the test statistic and p-value.

1. After entering the data, select Stat > Tables > Cross Tabulation and Chi-Square
2. Enter Dog in the Rows box
3. Enter Cat in the Columns box
4. Select the Chi-Square button and in the new window check the box for the Chi-square test and Expected cell counts
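5. Click OK and OK

##### Rows: Dog Columns: Cat

Cell contents: count (expected count)

| Dog | No | Yes | All |
|---|---|---|---|
| No | 183 (176.02) | 69 (75.98) | 252 |
| Yes | 183 (189.98) | 89 (82.02) | 272 |
| Missing | 1 | 0 |  |
| All | 366 | 158 | 524 |

##### Chi-Square Test

|  | Chi-Square | DF | P-Value |
|---|---|---|---|
| Pearson | 1.771 | 1 | 0.183 |
| Likelihood Ratio | 1.775 | 1 | 0.183 |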

Since the assumption was met in step 1, we can use the Pearson chi-square test statistic.
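The output also reports a Likelihood Ratio statistic. For readers curious where that number comes from, the sketch below computes the standard likelihood-ratio chi-square, $$G^2 = 2\sum O \ln(O/E)$$, from the observed counts. It is a hedged illustration in Python rather than a description of Minitab's internals, and the function name `likelihood_ratio_statistic` is hypothetical.

```python
import math

# Observed counts: rows are Dog = No, Yes; columns are Cat = No, Yes.
observed = [[183, 69], [183, 89]]

def likelihood_ratio_statistic(table):
    """Likelihood-ratio chi-square: 2 * sum of O * ln(O / E)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            if o > 0:  # a cell with zero observations contributes 0
                total += o * math.log(o / e)
    return 2 * total

g2 = likelihood_ratio_statistic(observed)
print(f"Likelihood ratio chi-square = {g2:.3f}")
```

With large samples the two statistics are usually close, as they are here (1.771 and 1.775).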

$$\text{Pearson}\;\chi^2 = 1.771$$

3. Determine a p value associated with the test statistic.

$$p = 0.183$$

4. Decide between the null and alternative hypotheses.

Our p value is greater than the standard 0.05 alpha level, so we fail to reject the null hypothesis.
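The p-value and the decision can be checked directly from the test statistic. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability equals `erfc(sqrt(x / 2))`. This shortcut is an assumption that holds only for 1 degree of freedom, and the function name is hypothetical.

```python
import math

def chi_square_p_value_df1(stat):
    """Upper-tail probability for a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(stat / 2))

alpha = 0.05          # standard significance level used in this lesson
p = chi_square_p_value_df1(1.771)

# Reject the null hypothesis only when p is at or below alpha.
decision = "reject H0" if p <= alpha else "fail to reject H0"
print(f"p = {p:.3f}: {decision}")
```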

5. State a "real world" conclusion.

There is not enough evidence of a relationship between dog ownership and cat ownership in the population of all World Campus STAT 200 students.

# 11.3.2.2 - Example: Summarized Data


## Example: Coffee and Tea Preference

Is there a relationship between liking tea and liking coffee?

The following table shows data collected from a random sample of 100 adults. Each was asked whether they liked coffee (yes or no) and whether they liked tea (yes or no).

|  | Likes Coffee: Yes | Likes Coffee: No |
|---|---|---|
| Likes Tea: Yes | 30 | 25 |
| Likes Tea: No | 10 | 35 |

Let's use the 5 step hypothesis testing procedure to address this research question.

1. Check any necessary assumptions and write null and alternative hypotheses.

$$H_0:$$ Liking coffee and liking tea are not related (i.e., independent) in the population.
$$H_a:$$ Liking coffee and liking tea are related (i.e., dependent) in the population.

Assumption: All expected counts are at least 5. The expected counts here are 22, 33, 18, and 27, so this assumption has been met.

2. Calculate an appropriate test statistic.

Let's use Minitab to calculate the test statistic and p-value.

1. Enter the table into a Minitab worksheet as shown below:

   | C1 | C2 | C3 |
   |---|---|---|
   | Likes Tea | Likes Coffee-Yes | Likes Coffee-No |
   | Yes | 30 | 25 |
   | No | 10 | 35 |
2. Select Stat > Tables > Cross Tabulation and Chi-Square
3. Select Summarized data in a two-way table from the dropdown
4. Enter the columns Likes Coffee-Yes and Likes Coffee-No in the Columns containing the table box
5. For the row labels enter Likes Tea (leave the column labels blank)
6. Select the Chi-Square button and check the boxes for Chi-square test and Expected cell counts.
7. Click OK and OK

Output

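##### Rows: Likes Tea  Columns: Worksheet columns

Cell contents: count (expected count)

| Likes Tea | Likes Coffee-Yes | Likes Coffee-No | All |
|---|---|---|---|
| Yes | 30 (22) | 25 (33) | 55 |
| No | 10 (18) | 35 (27) | 45 |
| All | 40 | 60 | 100 |

##### Chi-Square Test

|  | Chi-Square | DF | P-Value |
|---|---|---|---|
| Pearson | 10.774 | 1 | 0.001 |
| Likelihood Ratio | 11.138 | 1 | 0.001 |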

Since the assumption was met in step 1, we can use the Pearson chi-square test statistic.

$$\text{Pearson}\;\chi^2 = 10.774$$
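For a 2×2 table such as this one, the Pearson statistic (without a continuity correction) reduces to a single expression in the four cell counts: $$\chi^2 = \frac{n(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$. The Python sketch below applies that shortcut to the coffee and tea counts as a hand check. It is not necessarily how Minitab computes the value, and the function name `chi_square_2x2` is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]],
    with no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: Likes Tea = Yes, No; columns: Likes Coffee = Yes, No.
chi2 = chi_square_2x2(30, 25, 10, 35)
print(f"Pearson chi-square = {chi2:.3f}")
```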

3. Determine a p value associated with the test statistic.

$$p = 0.001$$

4. Decide between the null and alternative hypotheses.

Our p value is less than the standard 0.05 alpha level, so we reject the null hypothesis.

5. State a "real world" conclusion.

There is evidence of a relationship between liking coffee and liking tea in the population.
