Alcohol consumption and muscle strength
- Load the alcoholarm data.
- Fit a simple linear regression model with y = strength and x = alcohol.
- Display model results.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
- Display a residual plot with x = alcohol on the horizontal axis.
alcoholarm <- read.table("~/path-to-folder/alcoholarm.txt", header=T)
attach(alcoholarm)
model <- lm(strength ~ alcohol)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 26.36954    1.20273  21.925  < 2e-16 ***
# alcohol     -0.29587    0.05105  -5.796 5.14e-07 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 3.874 on 48 degrees of freedom
# Multiple R-squared: 0.4117, Adjusted R-squared: 0.3994
# F-statistic: 33.59 on 1 and 48 DF, p-value: 5.136e-07
plot(x=alcohol, y=strength,
xlab="Lifetime consumption of alcohol", ylab="Deltoid muscle strength",
panel.last = lines(sort(alcohol), fitted(model)[order(alcohol)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
plot(x=alcohol, y=residuals(model),
xlab="Lifetime consumption of alcohol", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(alcoholarm)
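The same steps can also be run without attach() by passing the data frame through the data argument, which keeps variables from one section from masking those of another. The sketch below is only an illustration: the data frame sim holds simulated values (not the alcoholarm study data), and abline(model) is used as an alternative to the lines(sort(...)) idiom for drawing the fitted line.

```r
# Simulated stand-in for alcoholarm.txt (hypothetical values, not the study data)
set.seed(1)
sim <- data.frame(alcohol = runif(50, 0, 40))
sim$strength <- 26 - 0.3 * sim$alcohol + rnorm(50, sd = 4)

# Fit the model with data= instead of attach()/detach()
model <- lm(strength ~ alcohol, data = sim)
coef(model)

# Scatterplot with the fitted line; abline() takes intercept and slope from the model
plot(strength ~ alcohol, data = sim,
     xlab = "Lifetime consumption of alcohol", ylab = "Deltoid muscle strength")
abline(model)

# Residuals versus fitted values with a dashed reference line at zero
plot(fitted(model), residuals(model),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

For a simple linear regression both approaches draw the same line; abline(model) extends it across the whole plotting region rather than stopping at the smallest and largest x values.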
Blood pressure
- Load the bloodpress data.
- Fit a simple linear regression model with y = BP and x = Age, display model results, and display a scatterplot of the data with the simple linear regression line.
- Fit a simple linear regression model with y = BP and x = Weight, display model results, and display a scatterplot of the data with the simple linear regression line.
- Fit a simple linear regression model with y = BP and x = Dur (duration of hypertension), display model results, and display a scatterplot of the data with the simple linear regression line.
- Display a residual plot for the model using x = Age with Weight on the horizontal axis.
- Fit a multiple linear regression model with y = BP, x1 = Age, and x2 = Weight.
- Display a residual plot for the model using x1 = Age and x2 = Weight with Duration on the horizontal axis.
bloodpress <- read.table("~/path-to-folder/bloodpress.txt", header=T)
attach(bloodpress)
model.1 <- lm(BP ~ Age)
summary(model.1)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  44.4545    18.7277   2.374  0.02894 *
# Age           1.4310     0.3849   3.718  0.00157 **
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 4.195 on 18 degrees of freedom
# Multiple R-squared: 0.4344, Adjusted R-squared: 0.403
# F-statistic: 13.82 on 1 and 18 DF, p-value: 0.001574
plot(x=Age, y=BP,
xlab="Age (years)", ylab="Diastolic blood pressure (mm Hg)",
panel.last = lines(sort(Age), fitted(model.1)[order(Age)]))
model.2 <- lm(BP ~ Weight)
summary(model.2)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  2.20531    8.66333   0.255    0.802
# Weight       1.20093    0.09297  12.917 1.53e-10 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.74 on 18 degrees of freedom
# Multiple R-squared: 0.9026, Adjusted R-squared: 0.8972
# F-statistic: 166.9 on 1 and 18 DF, p-value: 1.528e-10
plot(x=Weight, y=BP,
xlab="Weight (pounds)", ylab="Diastolic blood pressure (mm Hg)",
panel.last = lines(sort(Weight), fitted(model.2)[order(Weight)]))
model.3 <- lm(BP ~ Dur)
summary(model.3)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 109.2350     3.8563  28.327   <2e-16 ***
# Dur           0.7411     0.5703   1.299     0.21
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 5.333 on 18 degrees of freedom
# Multiple R-squared: 0.08575, Adjusted R-squared: 0.03496
# F-statistic: 1.688 on 1 and 18 DF, p-value: 0.2102
plot(x=Dur, y=BP,
xlab="Duration of hypertension (years)",
ylab="Diastolic blood pressure (mm Hg)",
panel.last = lines(sort(Dur), fitted(model.3)[order(Dur)]))
plot(x=Weight, y=residuals(model.1),
xlab="Weight (pounds)", ylab="Residuals from model with Age",
panel.last = abline(h=0, lty=2))
model.12 <- lm(BP ~ Age + Weight)
plot(x=Dur, y=residuals(model.12),
xlab="Duration of hypertension (years)",
ylab="Residuals from model with Age and Weight",
panel.last = abline(h=0, lty=2))
detach(bloodpress)
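Plotting residuals against a predictor that is not in the model is a way to check whether that predictor explains leftover variation. The sketch below illustrates the idea with simulated data; the column names mirror bloodpress, but the values and the coefficients used to generate them are invented for illustration.

```r
# Simulated stand-in for bloodpress.txt (hypothetical values)
set.seed(2)
n <- 20
sim <- data.frame(Age = runif(n, 45, 56), Weight = runif(n, 85, 102))
sim$BP <- 10 + 0.7 * sim$Age + 0.8 * sim$Weight + rnorm(n, sd = 1)

model.1  <- lm(BP ~ Age, data = sim)           # Weight omitted
model.12 <- lm(BP ~ Age + Weight, data = sim)  # Weight included

# Residuals from the Age-only model still trend with the omitted predictor
cor(residuals(model.1), sim$Weight)

# Least-squares residuals are uncorrelated with predictors already in the model
cor(residuals(model.12), sim$Weight)

plot(sim$Weight, residuals(model.1),
     xlab = "Weight (pounds)", ylab = "Residuals from model with Age")
abline(h = 0, lty = 2)
```

A clear trend in such a plot suggests adding the predictor; a patternless band around zero suggests it adds little beyond what the model already captures.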
Treadwear
- Load the treadwear data.
- Fit a simple linear regression model with y = groove and x = mileage.
- Display model results.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
treadwear <- read.table("~/path-to-folder/treadwear.txt", header=T)
attach(treadwear)
model <- lm(groove ~ mileage)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) 360.6367    11.6886   30.85 9.70e-09 ***
# mileage      -7.2806     0.6138  -11.86 6.87e-06 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 19.02 on 7 degrees of freedom
# Multiple R-squared: 0.9526, Adjusted R-squared: 0.9458
# F-statistic: 140.7 on 1 and 7 DF, p-value: 6.871e-06
plot(x=mileage, y=groove,
xlab="Mileage (1000s of miles)", ylab="Depth of groove (mils)",
panel.last = lines(sort(mileage), fitted(model)[order(mileage)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(treadwear)
Plutonium
- Load the alphapluto data.
- Fit a simple linear regression model with y = alpha and x = pluto.
- Display model results.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
alphapluto <- read.table("~/path-to-folder/alphapluto.txt", header=T)
attach(alphapluto)
model <- lm(alpha ~ pluto)
summary(model)
# Coefficients:
#              Estimate Std. Error t value Pr(>|t|)
# (Intercept) 0.0070331  0.0035988   1.954   0.0641 .
# pluto       0.0055370  0.0003659  15.133 9.08e-13 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.01257 on 21 degrees of freedom
# Multiple R-squared: 0.916, Adjusted R-squared: 0.912
# F-statistic: 229 on 1 and 21 DF, p-value: 9.077e-13
plot(x=pluto, y=alpha,
xlab="Plutonium activity (pCi/g)", ylab="Alpha count rate (number per second)",
panel.last = lines(sort(pluto), fitted(model)[order(pluto)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(alphapluto)
Alcohol and tobacco
- Load the alcoholtobacco data.
- Fit a simple linear regression model with y = Alcohol and x = Tobacco.
- Display model results.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
- Refit the model excluding Northern Ireland.
- Display a scatterplot of the data excluding Northern Ireland with the simple linear regression line for the model excluding Northern Ireland.
- Display a standardized residual plot for the model fit to all the data with fitted values on the horizontal axis.
- Calculate the standardized residual for Northern Ireland.
alcoholtobacco <- read.table("~/path-to-folder/alcoholtobacco.txt", header=T)
attach(alcoholtobacco)
model.1 <- lm(Alcohol ~ Tobacco)
summary(model.1)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   4.3512     1.6067   2.708   0.0241 *
# Tobacco       0.3019     0.4388   0.688   0.5087
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.8196 on 9 degrees of freedom
# Multiple R-squared: 0.04998, Adjusted R-squared: -0.05557
# F-statistic: 0.4735 on 1 and 9 DF, p-value: 0.5087
plot(x=Tobacco, y=Alcohol,
xlab="Ave weekly tobacco expenditure (GBP)",
ylab="Ave weekly alcohol expenditure (GBP)",
panel.last = lines(sort(Tobacco), fitted(model.1)[order(Tobacco)]))
plot(x=fitted(model.1), y=residuals(model.1),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
model.2 <- lm(Alcohol ~ Tobacco, subset=Region!="NorthernIreland")
plot(x=Tobacco[Region!="NorthernIreland"], y=Alcohol[Region!="NorthernIreland"],
xlab="Ave weekly tobacco expenditure (GBP)",
ylab="Ave weekly alcohol expenditure (GBP)",
panel.last = lines(sort(Tobacco[Region!="NorthernIreland"]), fitted(model.2)[order(Tobacco[Region!="NorthernIreland"])]))
plot(x=fitted(model.1), y=rstandard(model.1),
xlab="Fitted values", ylab="Standardized residuals",
panel.last = abline(h=0, lty=2))
rstandard(model.1)[Region=="NorthernIreland"] # -2.575075
detach(alcoholtobacco)
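The values returned by rstandard() can be reproduced by hand: each residual is divided by the residual standard error times the square root of one minus that observation's leverage. The sketch below checks this on simulated data (the data frame sim and its values are hypothetical, not the alcoholtobacco data).

```r
# Simulated stand-in for alcoholtobacco.txt (hypothetical values)
set.seed(3)
sim <- data.frame(Tobacco = runif(11, 2.5, 4.6))
sim$Alcohol <- 2 + 1 * sim$Tobacco + rnorm(11, sd = 0.4)

model <- lm(Alcohol ~ Tobacco, data = sim)

e <- residuals(model)        # raw residuals
s <- summary(model)$sigma    # residual standard error
h <- hatvalues(model)        # leverages (diagonal of the hat matrix)

# Standardized residual: e_i / (s * sqrt(1 - h_i))
r <- e / (s * sqrt(1 - h))

# Compare with R's built-in calculation
all.equal(r, rstandard(model))
```

Because the leverage term differs from point to point, two observations with the same raw residual can have different standardized residuals.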
Anscombe data
- Load the anscombe data.
- Fit a simple linear regression model with y = y3 and x = x3.
- Display model results.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
anscombe <- read.table("~/path-to-folder/anscombe.txt", header=T)
attach(anscombe)
model <- lm(y3 ~ x3)
summary(model)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)   3.0025     1.1245   2.670  0.02562 *
# x3            0.4997     0.1179   4.239  0.00218 **
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 1.236 on 9 degrees of freedom
# Multiple R-squared: 0.6663, Adjusted R-squared: 0.6292
# F-statistic: 17.97 on 1 and 9 DF, p-value: 0.002176
plot(x=x3, y=y3,
panel.last = lines(sort(x3), fitted(model)[order(x3)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(anscombe)
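R also ships a built-in anscombe data frame (in the datasets package) with columns x1 to x4 and y1 to y4. Assuming it holds the same values as anscombe.txt, the fit can be reproduced without a file, and the estimates should agree with the summary output above. Note that assigning the file contents to a variable named anscombe masks the built-in copy, which is why the sketch refers to it as datasets::anscombe.

```r
# Fit the third Anscombe pair using the built-in data frame
model <- lm(y3 ~ x3, data = datasets::anscombe)
round(coef(model), 4)

# Row index of the observation with the largest absolute standardized residual
which.max(abs(rstandard(model)))

# Residuals versus fitted values
plot(fitted(model), residuals(model),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

In the residual plot a single point sits far above the others, which is the feature this data set was constructed to show.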
Skin cancer mortality
- Load the skincancer data.
- Fit a simple linear regression model with y = Mort and x = Lat.
- Display a scatterplot of the data with the simple linear regression line.
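# File name assumed to follow the same pattern as the other data sets
skincancer <- read.table("~/path-to-folder/skincancer.txt", header=T)
attach(skincancer)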
model <- lm(Mort ~ Lat)
plot(x=Lat, y=Mort,
xlab="Latitude (at center of state)", ylab="Mortality (deaths per 10 million)",
main="Skin Cancer Mortality versus State Latitude",
panel.last = lines(sort(Lat), fitted(model)[order(Lat)]))
detach(skincancer)
Alligators
- Load the alligator data.
- Fit a simple linear regression model with y = weight and x = length.
- Display a scatterplot of the data with the simple linear regression line.
alligator <- read.table("~/path-to-folder/alligator.txt", header=T)
attach(alligator)
model <- lm(weight ~ length)
plot(x=length, y=weight, ylim=c(-50, 650),
panel.last = lines(sort(length), fitted(model)[order(length)]))
detach(alligator)
Alloy corrosion
- Load the corrosion data.
- Fit a simple linear regression model with y = wgtloss and x = iron.
- Display a scatterplot of the data with the simple linear regression line.
corrosion <- read.table("~/path-to-folder/corrosion.txt", header=T)
attach(corrosion)
model <- lm(wgtloss ~ iron)
plot(x=iron, y=wgtloss,
panel.last = lines(sort(iron), fitted(model)[order(iron)]))
detach(corrosion)
Hand span and height
- Load the handheight data.
- Fit a simple linear regression model with y = HandSpan and x = Height.
- Display a residual plot with fitted values on the horizontal axis.
handheight <- read.table("~/path-to-folder/handheight.txt", header=T)
attach(handheight)
model <- lm(HandSpan ~ Height)
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(handheight)
Chemical solution concentration
- Load the solconc data.
- Fit a simple linear regression model with y = y (concentration) and x = x (time).
- Display a residual plot with fitted values on the horizontal axis.
solconc <- read.table("~/path-to-folder/solutions_conc.txt", header=T)
attach(solconc)
model <- lm(y ~ x)
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(solconc)
Real estate sales
- Load the realestate data.
- Fit a simple linear regression model with y = SalePrice and x = SqrFeet.
- Display a residual plot with fitted values on the horizontal axis.
realestate <- read.table("~/path-to-folder/realestate_sales.txt", header=T)
attach(realestate)
model <- lm(SalePrice ~ SqrFeet)
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
detach(realestate)
Old Faithful geyser eruptions
- Load the oldfaithful data.
- Fit a simple linear regression model with y = waiting and x = eruption.
- Display a histogram and normal probability plot of the residuals.
oldfaithful <- read.table("~/path-to-folder/oldfaithful.txt", header=T)
attach(oldfaithful)
model <- lm(waiting ~ eruption)
hist(residuals(model), main="", breaks=12)
qqnorm(residuals(model), main="")
qqline(residuals(model))
detach(oldfaithful)
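For readers without oldfaithful.txt, R's built-in faithful data frame can serve as a stand-in for trying the same diagnostics. This is an assumption made for illustration: faithful may not contain the same observations as the file, and its eruption column is named eruptions rather than eruption.

```r
# Built-in Old Faithful data (columns: eruptions, waiting)
model <- lm(waiting ~ eruptions, data = datasets::faithful)
res <- residuals(model)

# Histogram of the residuals
hist(res, main = "", breaks = 12, xlab = "Residuals")

# Normal probability plot; qqnorm() also returns the plotted coordinates invisibly
qq <- qqnorm(res, main = "")
qqline(res)   # reference line through the first and third quartiles
```

Points that follow the reference line closely are consistent with normally distributed errors; systematic curvature at the ends points to skewness or heavy tails.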
Hospital infection risk
- Load the infectionrisk data.
- Select only hospitals in regions 1 or 2.
- Fit a simple linear regression model with y = InfctRsk and x = Stay.
- Display a normal probability plot of the residuals.
infectionrisk <- read.table("~/path-to-folder/infectionrisk.txt", header=T)
infectionrisk <- infectionrisk[infectionrisk$Region==1 | infectionrisk$Region==2, ]
attach(infectionrisk)
model <- lm(InfctRsk ~ Stay)
qqnorm(residuals(model), main="")
qqline(residuals(model))
detach(infectionrisk)
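The region filter can be written more compactly with %in%, or passed to lm() through its subset argument so that the original data frame is left unchanged. The sketch below compares the approaches on a small simulated data frame (sim and its values are hypothetical, not the infectionrisk data).

```r
# Simulated stand-in for infectionrisk.txt (hypothetical values)
set.seed(4)
sim <- data.frame(Region = rep(1:4, each = 5), Stay = runif(20, 7, 12))
sim$InfctRsk <- 1 + 0.35 * sim$Stay + rnorm(20, sd = 0.5)

# Two equivalent ways to keep regions 1 and 2
keep1 <- sim[sim$Region == 1 | sim$Region == 2, ]
keep2 <- sim[sim$Region %in% c(1, 2), ]
identical(keep1, keep2)

# Fit on the filtered data frame, or filter inside lm() with subset=
model.a <- lm(InfctRsk ~ Stay, data = keep2)
model.b <- lm(InfctRsk ~ Stay, data = sim, subset = Region %in% c(1, 2))
all.equal(coef(model.a), coef(model.b))
```

Using subset= avoids overwriting the full data set, which helps when later steps need the other regions.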
Car stopping distances
- Load the carstopping data.
- Fit a simple linear regression model with y = StopDist and x = Speed.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
- Create a new response variable equal to √StopDist.
- Fit a simple linear regression model with y = √StopDist and x = Speed.
- Display a scatterplot of the data with the simple linear regression line.
- Display a residual plot with fitted values on the horizontal axis.
- Use the model to predict StopDist for Speed = 10, 20, 30, and 40.
carstopping <- read.table("~/path-to-folder/carstopping.txt", header=T)
attach(carstopping)
model <- lm(StopDist ~ Speed)
plot(x=Speed, y=StopDist,
panel.last = lines(sort(Speed), fitted(model)[order(Speed)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
sqrtdist <- sqrt(StopDist)
model <- lm(sqrtdist ~ Speed)
plot(x=Speed, y=sqrtdist,
panel.last = lines(sort(Speed), fitted(model)[order(Speed)]))
plot(x=fitted(model), y=residuals(model),
xlab="Fitted values", ylab="Residuals",
panel.last = abline(h=0, lty=2))
predict(model, interval="prediction",
newdata=data.frame(Speed=c(10, 20, 30, 40)))^2
#         fit      lwr       upr
# 1  11.86090  3.93973  24.03997
# 2  35.63671 20.42935  55.04771
# 3  72.17067 49.44080  99.18664
# 4 121.46277 90.63292 156.79793
detach(carstopping)
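The final step squares the predictions because the model is fit on the square-root scale: squaring the fitted value and both interval limits returns them to the original units, which is valid as long as the lower limits are positive. The sketch below walks through this using R's built-in cars data frame (columns speed and dist) as a stand-in; this is an assumption for illustration, since cars is a different data set from carstopping.txt and gives different numbers.

```r
# Built-in stand-in data with a square-root response added
cars2 <- transform(datasets::cars, sqrtdist = sqrt(dist))
model <- lm(sqrtdist ~ speed, data = cars2)

new <- data.frame(speed = c(10, 20, 30, 40))

# Prediction intervals on the square-root scale
pred.sqrt <- predict(model, newdata = new, interval = "prediction")

# Squaring is order-preserving only for non-negative values, so check first
stopifnot(all(pred.sqrt[, "lwr"] > 0))

# Back-transform fit and limits to the original distance scale
pred <- pred.sqrt^2
pred
```

The back-transformed intervals are no longer symmetric around the fitted value, which matches the pattern in the output above.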