# Software Help 4

Software Help 4

The next two pages cover the Minitab and R commands for the procedures in this lesson.

Below is a zip file that contains all the data sets used in this lesson:

STAT501_Lesson04.zip

• alcoholarm.txt
• alcoholtobacco.txt
• alligator.txt
• alphapluto.txt
• anscombe.txt
• bloodpress.txt
• bluegills.txt
• carstopping.txt
• corrosion.txt
• handheight.txt
• incomebirth.txt
• realestate_sales.txt
• residuals.txt
• skincancer.txt
• solutions_conc.txt

# Minitab Help 4: SLR Model Assumptions

Minitab Help 4: SLR Model Assumptions

# R Help 4: SLR Model Assumptions

R Help 4: SLR Model Assumptions

## R

### Alcohol consumption and muscle strength

• Fit a simple linear regression model with y = strength and x = alcohol.
• Display model results.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
• Display a residual plot with x = alcohol on the horizontal axis.
alcoholarm <- read.table("~/path-to-folder/alcoholarm.txt", header=T)
attach(alcoholarm)

model <- lm(strength ~ alcohol)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 26.36954    1.20273  21.925  < 2e-16 ***
# alcohol     -0.29587    0.05105  -5.796 5.14e-07 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 3.874 on 48 degrees of freedom
# Multiple R-squared:  0.4117,  Adjusted R-squared:  0.3994
# F-statistic: 33.59 on 1 and 48 DF,  p-value: 5.136e-07

plot(x=alcohol, y=strength,
xlab="Lifetime consumption of alcohol", ylab="Deltoid muscle strength",
panel.last = lines(sort(alcohol), fitted(model)[order(alcohol)]))

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

plot(x=alcohol, y=residuals(model),
panel.last = abline(h=0, lty=2))

detach(alcoholarm)

### Blood pressure

• Fit a simple linear regression model with y = BP and x = Age, display model results, and display a scatterplot of the data with the simple linear regression line.
• Fit a simple linear regression model with y = BP and x = Weight, display model results, and display a scatterplot of the data with the simple linear regression line.
• Fit a simple linear regression model with y = BP and x = Duration, display model results, and display a scatterplot of the data with the simple linear regression line.
• Display a residual plot for the model using x = Age with Weight on the horizontal axis.
• Fit a multiple linear regression model with y = BP, x1 = Age, and x2 = Weight.
• Display a residual plot for the model using x1 = Age and x2 = Weight with Duration on the horizontal axis.
bloodpress <- read.table("~/path-to-folder/bloodpress.txt", header=T)
attach(bloodpress)

model.1 <- lm(BP ~ Age)
summary(model.1)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  44.4545    18.7277   2.374  0.02894 *
# Age           1.4310     0.3849   3.718  0.00157 **
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 4.195 on 18 degrees of freedom
# Multiple R-squared:  0.4344,  Adjusted R-squared:  0.403
# F-statistic: 13.82 on 1 and 18 DF,  p-value: 0.001574
plot(x=Age, y=BP,
xlab="Age (years)", ylab="Diastolic blood pressure (mm Hg)",
panel.last = lines(sort(Age), fitted(model.1)[order(Age)]))

model.2 <- lm(BP ~ Weight)
summary(model.2)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  2.20531    8.66333   0.255    0.802
# Weight       1.20093    0.09297  12.917 1.53e-10 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.74 on 18 degrees of freedom
# Multiple R-squared:  0.9026,  Adjusted R-squared:  0.8972
# F-statistic: 166.9 on 1 and 18 DF,  p-value: 1.528e-10
plot(x=Weight, y=BP,
xlab="Weight (pounds)", ylab="Diastolic blood pressure (mm Hg)",
panel.last = lines(sort(Weight), fitted(model.2)[order(Weight)]))

model.3 <- lm(BP ~ Dur)
summary(model.3)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 109.2350     3.8563  28.327   <2e-16 ***
# Dur           0.7411     0.5703   1.299     0.21
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 5.333 on 18 degrees of freedom
# Multiple R-squared:  0.08575,  Adjusted R-squared:  0.03496
# F-statistic: 1.688 on 1 and 18 DF,  p-value: 0.2102
plot(x=Dur, y=BP,
xlab="Duration of hypertension (years)",
ylab="Diastolic blood pressure (mm Hg)",
panel.last = lines(sort(Dur), fitted(model.3)[order(Dur)]))

plot(x=Weight, y=residuals(model.1),
xlab="Weight (pounds)", ylab="Residuals from model with Age",
panel.last = abline(h=0, lty=2))

model.12 <- lm(BP ~ Age + Weight)

plot(x=Dur, y=residuals(model.12),
xlab="Duration of hypertension (years)",
ylab="Residuals from model with Age and Weight",
panel.last = abline(h=0, lty=2))

detach(bloodpress)

• Fit a simple linear regression model with y = groove and x = mileage.
• Display model results.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
treadwear <- read.table("~/path-to-folder/treadwear.txt", header=T)

model <- lm(groove ~ mileage)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 360.6367    11.6886   30.85 9.70e-09 ***
# mileage      -7.2806     0.6138  -11.86 6.87e-06 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 19.02 on 7 degrees of freedom
# Multiple R-squared:  0.9526,  Adjusted R-squared:  0.9458
# F-statistic: 140.7 on 1 and 7 DF,  p-value: 6.871e-06

plot(x=mileage, y=groove,
xlab="Mileage (1000s of miles)", ylab="Depth of groove (mils)",
panel.last = lines(sort(mileage), fitted(model)[order(mileage)]))

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

detach(treadwear)

### Plutonium

• Fit a simple linear regression model with y = alpha and x = pluto.
• Display model results.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
alphapluto <- read.table("~/path-to-folder/alphapluto.txt", header=T)
attach(alphapluto)

model <- lm(alpha ~ pluto)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.0070331  0.0035988   1.954   0.0641 .
# pluto       0.0055370  0.0003659  15.133 9.08e-13 ***
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.01257 on 21 degrees of freedom
# Multiple R-squared:  0.916,  Adjusted R-squared:  0.912
# F-statistic:   229 on 1 and 21 DF,  p-value: 9.077e-13

plot(x=pluto, y=alpha,
xlab="Plutonium activity (pCi/g)", ylab="Alpha count rate (number per second)",
panel.last = lines(sort(pluto), fitted(model)[order(pluto)]))

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

detach(alphapluto)

### Alcohol and tobacco

• Fit a simple linear regression model with y = Alcohol and x = Tobacco.
• Display model results.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
• Refit the model excluding Northern Ireland.
• Display a scatterplot of the data excluding Northern Ireland with the simple linear regression line for the model excluding Northern Ireland.
• Display a standardized residual plot for the model fit to all the data with fitted values on the horizontal axis.
• Calculate the standardized residual for Northern Ireland.
alcoholtobacco <- read.table("~/path-to-folder/alcoholtobacco.txt", header=T)
attach(alcoholtobacco)

model.1 <- lm(Alcohol ~ Tobacco)
summary(model.1)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   4.3512     1.6067   2.708   0.0241 *
# Tobacco       0.3019     0.4388   0.688   0.5087
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.8196 on 9 degrees of freedom
# Multiple R-squared:  0.04998,  Adjusted R-squared:  -0.05557
# F-statistic: 0.4735 on 1 and 9 DF,  p-value: 0.5087

plot(x=Tobacco, y=Alcohol,
xlab="Ave weekly tobacco expenditure (GBP)",
ylab="Ave weekly alcohol expenditure (GBP)",
panel.last = lines(sort(Tobacco), fitted(model.1)[order(Tobacco)]))

plot(x=fitted(model.1), y=residuals(model.1),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

model.2 <- lm(Alcohol ~ Tobacco, subset=Region!="NorthernIreland")

plot(x=Tobacco[Region!="NorthernIreland"], y=Alcohol[Region!="NorthernIreland"],
xlab="Ave weekly tobacco expenditure (GBP)",
ylab="Ave weekly alcohol expenditure (GBP)",
panel.last = lines(sort(Tobacco), fitted(model.2)[order(Tobacco)]))

plot(x=fitted(model.1), y=rstandard(model.1),
xlab="Fitted values", ylab="Standardized residuals",
panel.last = abline(h=0, lty=2))

rstandard(model.1)[Region=="NorthernIreland"] # -2.575075

detach(alcoholtobacco)

### Anscombe data

• Fit a simple linear regression model with y = y3 and x = x3.
• Display model results.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
anscombe <- read.table("~/path-to-folder/anscombe.txt", header=T)
attach(anscombe)

model <- lm(y3 ~ x3)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   3.0025     1.1245   2.670  0.02562 *
# x3            0.4997     0.1179   4.239  0.00218 **
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.236 on 9 degrees of freedom
# Multiple R-squared:  0.6663,  Adjusted R-squared:  0.6292
# F-statistic: 17.97 on 1 and 9 DF,  p-value: 0.002176

plot(x=x3, y=y3,
panel.last = lines(sort(x3), fitted(model)[order(x3)]))

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

detach(anscombe)

### Skin cancer mortality

• Load the skin cancer data.
• Fit a simple linear regression model with y = Mort and x = Lat.
• Display a scatterplot of the data with the simple linear regression line.
attach(skincancer)

model <- lm(Mort ~ Lat)

plot(x=Lat, y=Mort,
xlab="Latitude (at center of state)", ylab="Mortality (deaths per 10 million)",
main="Skin Cancer Mortality versus State Latitude",
panel.last = lines(sort(Lat), fitted(model)[order(Lat)]))

detach(skincancer)

### Alligators

• Fit a simple linear regression model with y = weight and x = length.
• Display a scatterplot of the data with the simple linear regression line.
alligator <- read.table("~/path-to-folder/alligator.txt", header=T)
attach(alligator)

model <- lm(weight ~ length)

plot(x=length, y=weight, ylim=c(-50, 650),
panel.last = lines(sort(length), fitted(model)[order(length)]))

detach(alligator)

### Alloy corrosion

• Fit a simple linear regression model with y = wgtloss and x = iron.
• Display a scatterplot of the data with the simple linear regression line.
corrosion <- read.table("~/path-to-folder/corrosion.txt", header=T)
attach(corrosion)

model <- lm(wgtloss ~ iron)

plot(x=iron, y=wgtloss,
panel.last = lines(sort(iron), fitted(model)[order(iron)]))

detach(corrosion)

### Handcode and height

• Fit a simple linear regression model with y = HandSpan and x = Height.
• Display a residual plot with fitted values on the horizontal axis.
handheight <- read.table("~/path-to-folder/handheight.txt", header=T)
attach(handheight)

model <- lm(HandSpan ~ Height)

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

detach(handheight)

### Chemical solution concentration

• Fit a simple linear regression model with y = y (concentration) and x = x (time).
• Display a residual plot with fitted values on the horizontal axis.
solconc <- read.table("~/path-to-folder/solutions_conc.txt", header=T)
attach(solconc)

model <- lm(y ~ x)

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

detach(solconc)

### Real estate sales

• Fit a simple linear regression model with y = SalePrice and x = Sqrfeet.
• Display a residual plot with fitted values on the horizontal axis.
realestate <- read.table("~/path-to-folder/realestate_sales.txt", header=T)
attach(realestate)

model <- lm(SalePrice ~ SqrFeet)

plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

detach(realestate)

### Old Faithful geyser eruptions

• Fit a simple linear regression model with y = waiting and x = eruption.
• Display a histogram and normal probability plot of the residuals.
oldfaithful <- read.table("~/path-to-folder/oldfaithful.txt", header=T)
attach(oldfaithful)

model <- lm(waiting ~ eruption)

hist(residuals(model), main="", breaks=12)

qqnorm(residuals(model), main="")
qqline(residuals(model))

detach(oldfaithful)

### Hospital infection risk

• Select only hospitals in regions 1 or 2.
• Fit a simple linear regression model with y = InfctRsk and x = Stay.
• Display a normal probability plot of the residuals.
infectionrisk <- read.table("~/path-to-folder/infectionrisk.txt", header=T)
infectionrisk <- infectionrisk[infectionrisk$Region==1 | infectionrisk$Region==2, ]
attach(infectionrisk)

model <- lm(InfctRsk ~ Stay)

qqnorm(residuals(model), main="")
qqline(residuals(model))

detach(infectionrisk)

### Car stopping distances

• Fit a simple linear regression model with y = StopDist and x = Speed.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
• Create a new response variable equal to √StopDist.
• Fit a simple linear regression model with y = √StopDist and x = Speed.
• Display a scatterplot of the data with the simple linear regression line.
• Display a residual plot with fitted values on the horizontal axis.
• Use the model to predict StopDist for Speed = 10, 20, 30, and 40.
carstopping <- read.table("~/path-to-folder/carstopping.txt", header=T)
attach(carstopping)

model <- lm(StopDist ~ Speed)
plot(x=Speed, y=StopDist,
panel.last = lines(sort(Speed), fitted(model)[order(Speed)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

sqrtdist <- sqrt(StopDist)

model <- lm(sqrtdist ~ Speed)
plot(x=Speed, y=sqrtdist,
panel.last = lines(sort(Speed), fitted(model)[order(Speed)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))

predict(model, interval="prediction",
newdata=data.frame(Speed=c(10, 20, 30, 40)))^2
#         fit      lwr       upr
# 1  11.86090  3.93973  24.03997
# 2  35.63671 20.42935  55.04771
# 3  72.17067 49.44080  99.18664
# 4 121.46277 90.63292 156.79793

detach(carstopping)

  Link ↥ Has Tooltip/Popover Toggleable Visibility