14.1 - An Example

Example 14-1

A physiologist was interested in learning whether smoking history and different types of stress tests influence the timing of a subject's maximum oxygen uptake, as measured in minutes. The researcher classified a subject's smoking history as either heavy smoking, moderate smoking, or non-smoking. He was interested in seeing the effects of three different types of stress tests — a test performed on a bicycle, a test on a treadmill, and a test on steps. The physiologist recruited 9 non-smokers, 9 moderate smokers, and 9 heavy smokers to participate in his experiment, for a total of n = 27 subjects. He then randomly assigned each of his recruited subjects to undergo one of the three types of stress tests. Here is his resulting data:

Smoking History    Test
                   Bicycle (1)         Treadmill (2)       Step Test (3)
Nonsmoker (1)      12.8, 13.5, 11.2    16.2, 18.1, 17.8    22.6, 19.3, 18.9
Moderate (2)       10.9, 11.1, 9.8     15.5, 13.8, 16.2    20.1, 21.0, 15.9
Heavy (3)          8.7, 9.2, 7.5       14.7, 13.2, 8.1     16.2, 16.1, 17.8

Is there sufficient evidence at the \(\alpha = 0.05\) level to conclude that smoking history has an effect on the time to maximum oxygen uptake? Is there sufficient evidence at the \(\alpha = 0.05\) level to conclude that the type of stress test has an effect on the time to maximum oxygen uptake? And, is there evidence of an interaction between smoking history and the type of stress test?

Answer

Let's start by stating our analysis of variance model, as well as any assumptions that we'll make. Let \(X_{ijk}\) denote the time, in minutes, until maximum oxygen uptake for smoking history \(i = 1, 2, 3\), type of test \(j = 1, 2, 3\), and replicate \(k = 1, 2, 3\). So, for example, \(X_{111} = 12.8 , X_{112} = 13.5\), and so on. Let's assume the \(X_{ijk}\) are mutually independent normal random variables with common variance \(\sigma^2\) and mean:

\(\mu_{ij}=\mu+\alpha_i+\beta_j+\gamma_{ij}\)

subject to the following constraints:

\(\sum\limits_{i=1}^a \alpha_i=0\), \(\sum\limits_{j=1}^b \beta_j=0\), \(\sum\limits_{i=1}^a \gamma_{ij}=0\), and \(\sum\limits_{j=1}^b \gamma_{ij}=0\)
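To make the parameters concrete, here is a minimal sketch in Python that estimates \(\mu\), \(\alpha_i\), \(\beta_j\), and \(\gamma_{ij}\) from the sample means and checks that the estimates satisfy the constraints. The variable names (`X`, `mu_hat`, `alpha_hat`, and so on) are illustrative and do not come from the example, and the indices are zero-based rather than running from 1 to 3.

```python
# X[i][j] holds the three replicate times (minutes) for smoking history i
# (0 = nonsmoker, 1 = moderate, 2 = heavy) and test j
# (0 = bicycle, 1 = treadmill, 2 = step). Zero-based indices are an
# assumption of this sketch; the text uses 1, 2, 3.
X = [
    [[12.8, 13.5, 11.2], [16.2, 18.1, 17.8], [22.6, 19.3, 18.9]],
    [[10.9, 11.1, 9.8], [15.5, 13.8, 16.2], [20.1, 21.0, 15.9]],
    [[8.7, 9.2, 7.5], [14.7, 13.2, 8.1], [16.2, 16.1, 17.8]],
]
a, b = len(X), len(X[0])


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Sample means at each level of aggregation. Because the design is balanced,
# averaging cell means gives the same result as averaging raw observations.
cell_mean = [[mean(X[i][j]) for j in range(b)] for i in range(a)]
row_mean = [mean(cell_mean[i]) for i in range(a)]
col_mean = [mean(cell_mean[i][j] for i in range(a)) for j in range(b)]
mu_hat = mean(row_mean)

# Estimated effects, expressed as deviations from the grand mean.
alpha_hat = [row_mean[i] - mu_hat for i in range(a)]
beta_hat = [col_mean[j] - mu_hat for j in range(b)]
gamma_hat = [
    [cell_mean[i][j] - row_mean[i] - col_mean[j] + mu_hat for j in range(b)]
    for i in range(a)
]

print(f"mu_hat    = {mu_hat:.3f}")
print("alpha_hat =", [round(x, 3) for x in alpha_hat])
print("beta_hat  =", [round(x, 3) for x in beta_hat])
for i in range(a):
    print(f"gamma_hat[{i}] =", [round(x, 3) for x in gamma_hat[i]])

# Each set of estimates sums to zero (up to rounding error), as required.
print("sum of alpha_hat:", round(sum(alpha_hat), 10))
print("sum of beta_hat: ", round(sum(beta_hat), 10))
```

The estimated smoking history effects come out near 2.04, 0.25, and -2.29 minutes, and every row and column of `gamma_hat` sums to zero up to floating-point rounding.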

In that case, testing whether or not there is an interaction between smoking history and the type of stress test involves testing the null hypothesis:

\(H_0:\gamma_{ij}=0 \text{ for } i=1,2,3 \text{ and } j=1,2,3\)

against all of the possible alternatives. We'll definitely want to engage Minitab in conducting the necessary analysis of variance! To do so, we first enter the data into a Minitab worksheet in stacked form, with the times in one column (Time) and the factor levels in two more columns (Smoker and Test). We then do the following:

  1. Under the Stat menu, we select ANOVA, and then Balanced ANOVA... (our data are "balanced" because every cell contains the same number of measurements, 3).

  2. In the pop-up window that appears, we specify the Response and the Model:

    [Screenshot: Minitab Balanced ANOVA dialog box with the Response and Model fields filled in]

    You might want to take particular note of the way we specify the interaction between smoking history and the type of test in Minitab, namely, as Smoker*Test.

  3. We select OK, and the resulting output appears in the Session Window.

Here's what the output looks like, with the row pertaining to the interaction term marked with an arrow:

ANOVA: Time versus Smoker, Test

Factor    Type     Levels    Values
Smoker    fixed    3         1, 2, 3
Test      fixed    3         1, 2, 3

Analysis of Variance for Time

Source         DF    SS         MS         F        P
Smoker          2    84.899     42.449     12.90    0.000
Test            2    298.072    149.036    45.28    0.000
Smoker*Test     4    2.815      0.704      0.21     0.927    <--
Error          18    59.247     3.291
Total          26    445.032

S = 1.81424    R-Sq = 86.69%    R-Sq(adj) = 80.77%

As you can see, the P-value, 0.927, is very large. We do not reject the null hypothesis that the interaction terms are all zero. That is, there is insufficient evidence at the 0.05 level to conclude that there is an interaction between smoking history and the type of stress test.
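The sums of squares, mean squares, and F statistics in the Minitab output can be reproduced directly from the data. The following is a minimal sketch in pure Python, assuming a balanced design and using the same 1, 2, 3 coding as the worksheet; the function name `balanced_two_way_anova` and the dictionary layout are illustrative choices, not part of Minitab. It does not compute P-values, which would require the F distribution.

```python
import math

# Times to maximum oxygen uptake (minutes), keyed by (smoker level, test level).
cells = {
    (1, 1): [12.8, 13.5, 11.2], (1, 2): [16.2, 18.1, 17.8], (1, 3): [22.6, 19.3, 18.9],
    (2, 1): [10.9, 11.1, 9.8], (2, 2): [15.5, 13.8, 16.2], (2, 3): [20.1, 21.0, 15.9],
    (3, 1): [8.7, 9.2, 7.5], (3, 2): [14.7, 13.2, 8.1], (3, 3): [16.2, 16.1, 17.8],
}


def balanced_two_way_anova(cells):
    """Return {source: (df, SS, MS, F)} for a balanced two-factor design.

    The first key component is treated as Smoker, the second as Test.
    """
    rows = sorted({i for i, _ in cells})
    cols = sorted({j for _, j in cells})
    a, b = len(rows), len(cols)
    n = len(cells[rows[0], cols[0]])  # replicates per cell, assumed equal
    # Correction factor: (grand total)^2 / (total number of observations).
    correction = sum(sum(v) for v in cells.values()) ** 2 / (a * b * n)

    ss_total = sum(x * x for v in cells.values() for x in v) - correction
    ss_a = sum(sum(sum(cells[i, j]) for j in cols) ** 2 for i in rows) / (b * n) - correction
    ss_b = sum(sum(sum(cells[i, j]) for i in rows) ** 2 for j in cols) / (a * n) - correction
    ss_cells = sum(sum(v) ** 2 for v in cells.values()) / n - correction
    ss_ab = ss_cells - ss_a - ss_b      # interaction: what the cells add beyond main effects
    ss_error = ss_total - ss_cells      # within-cell variation

    df_error = a * b * (n - 1)
    ms_error = ss_error / df_error
    table = {}
    for name, df, ss in [("Smoker", a - 1, ss_a), ("Test", b - 1, ss_b),
                         ("Smoker*Test", (a - 1) * (b - 1), ss_ab)]:
        table[name] = (df, ss, ss / df, (ss / df) / ms_error)
    table["Error"] = (df_error, ss_error, ms_error, None)
    table["Total"] = (a * b * n - 1, ss_total, None, None)
    return table


table = balanced_two_way_anova(cells)
for source, (df, ss, ms, f) in table.items():
    ms_text = f"{ms:10.3f}" if ms is not None else ""
    f_text = f"{f:8.2f}" if f is not None else ""
    print(f"{source:<12}{df:>3}{ss:10.3f}{ms_text}{f_text}")

# Summary statistics reported beneath the Minitab table.
s = math.sqrt(table["Error"][2])
r_sq = 1 - table["Error"][1] / table["Total"][1]
print(f"S = {s:.5f}   R-Sq = {100 * r_sq:.2f}%")
```

The interaction F statistic of about 0.21 is well below the tabulated critical value \(F_{0.05}(4, 18) \approx 2.93\) (a value from a standard F table, not from the Minitab output), which agrees with the large P-value.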

Now, testing whether or not smoking history has an effect on the timing of maximum oxygen uptake involves testing the null hypothesis:

\(H_0:\alpha_1=\alpha_2=\alpha_3=0\)

against all of the possible alternatives. Here's what the output looks like, with the row pertaining to the smoking history term marked with an arrow:

ANOVA: Time versus Smoker, Test

Factor    Type     Levels    Values
Smoker    fixed    3         1, 2, 3
Test      fixed    3         1, 2, 3

Analysis of Variance for Time

Source         DF    SS         MS         F        P
Smoker          2    84.899     42.449     12.90    0.000    <--
Test            2    298.072    149.036    45.28    0.000
Smoker*Test     4    2.815      0.704      0.21     0.927
Error          18    59.247     3.291
Total          26    445.032

S = 1.81424    R-Sq = 86.69%    R-Sq(adj) = 80.77%

As you can see, the P-value is very small (< 0.001). We reject the null hypothesis that the smoking history parameters are all zero. That is, there is sufficient evidence at the 0.05 level to conclude that smoking history has an effect on the timing of maximum oxygen uptake.

Now, testing whether or not the type of stress test has an effect on the timing of maximum oxygen uptake involves testing the null hypothesis:

\(H_0:\beta_1=\beta_2=\beta_3=0\)

against all of the possible alternatives. Here's what the output looks like, with the row pertaining to the type of stress test term marked with an arrow:

ANOVA: Time versus Smoker, Test

Factor    Type     Levels    Values
Smoker    fixed    3         1, 2, 3
Test      fixed    3         1, 2, 3

Analysis of Variance for Time

Source         DF    SS         MS         F        P
Smoker          2    84.899     42.449     12.90    0.000
Test            2    298.072    149.036    45.28    0.000    <--
Smoker*Test     4    2.815      0.704      0.21     0.927
Error          18    59.247     3.291
Total          26    445.032

S = 1.81424    R-Sq = 86.69%    R-Sq(adj) = 80.77%

As you can see, again, the P-value is very small (< 0.001). We reject the null hypothesis that the stress test parameters are all zero. That is, there is sufficient evidence at the 0.05 level to conclude that the type of stress test has an effect on the timing of maximum oxygen uptake.
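For reference, each F statistic in the output is the factor's mean square divided by the error mean square, \(MS_{Error} = 59.247/18 \approx 3.2915\). Here is a sketch of the arithmetic for the two main effects, where the critical value \(F_{0.05}(2, 18) \approx 3.55\) is taken from a standard F table rather than from the Minitab output:

\(F_{Smoker}=\dfrac{42.449}{3.2915}\approx 12.90 \quad \text{and} \quad F_{Test}=\dfrac{149.036}{3.2915}\approx 45.28\)

Both ratios far exceed 3.55, which agrees with the very small P-values.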

In summary, based on these data, the physiologist can conclude that both smoking history and the type of stress test appear to affect the timing of maximum oxygen uptake, but that the data do not suggest that the two factors interact.
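One informal way to see what "no interaction" looks like in these data is to compare the cell means across tests. The sketch below, in Python with illustrative names (`times`, `cell_means`, `gaps`), tabulates the nine cell means and the gap between nonsmokers and heavy smokers on each test. It is a descriptive check only and is not a substitute for the F test.

```python
# Times (minutes) by smoking history and test, copied from the data table.
times = {
    "Nonsmoker": {"Bicycle": [12.8, 13.5, 11.2],
                  "Treadmill": [16.2, 18.1, 17.8],
                  "Step": [22.6, 19.3, 18.9]},
    "Moderate": {"Bicycle": [10.9, 11.1, 9.8],
                 "Treadmill": [15.5, 13.8, 16.2],
                 "Step": [20.1, 21.0, 15.9]},
    "Heavy": {"Bicycle": [8.7, 9.2, 7.5],
              "Treadmill": [14.7, 13.2, 8.1],
              "Step": [16.2, 16.1, 17.8]},
}
tests = ["Bicycle", "Treadmill", "Step"]

# Mean time in each of the nine cells.
cell_means = {
    group: {test: sum(obs) / len(obs) for test, obs in by_test.items()}
    for group, by_test in times.items()
}

# Print the cell means as a small table, one row per smoking history.
print(f"{'':<10}" + "".join(f"{t:>11}" for t in tests))
for group, means in cell_means.items():
    print(f"{group:<10}" + "".join(f"{means[t]:>11.2f}" for t in tests))

# Gap between nonsmokers and heavy smokers on each test. If the factors do not
# interact, these gaps should differ only by sampling variation.
gaps = {t: cell_means["Nonsmoker"][t] - cell_means["Heavy"][t] for t in tests}
print({t: round(g, 2) for t, g in gaps.items()})
```

Nonsmokers take longer than heavy smokers on every test, by roughly 3.6 to 5.4 minutes, and the times increase from bicycle to treadmill to step test within every smoking group. That pattern of roughly parallel profiles is what the large interaction P-value reflects.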

Note!

We were able to include an interaction term in our model in the previous example because we had multiple observations (three, to be exact) in each of the cells. If there were only one observation in each cell, we could not include an interaction term in our model.
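The degrees of freedom make the reason explicit. As a sketch, with \(a\) levels of the first factor, \(b\) levels of the second, and \(n\) observations per cell (the symbol \(n\) for the per-cell count is introduced here for illustration), the total degrees of freedom split as:

\(abn-1=(a-1)+(b-1)+(a-1)(b-1)+ab(n-1)\)

In the example, \(a=b=n=3\), giving \(26=2+2+4+18\). With \(n=1\), the error term \(ab(n-1)\) drops to 0: the main effects and the interaction use up all \(ab-1=8\) degrees of freedom, leaving none to estimate \(\sigma^2\) and therefore no denominator for the F tests.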