1.3 R Code for Two Examples in Lessons 1.1 and 1.2

One example in Lesson 1.1 and Lesson 1.2 concerned the annual number of earthquakes worldwide with a magnitude greater than 7.0 on the seismic scale. We identified an AR(1) model (autoregressive model of order 1), estimated the model, and assessed the residuals. Below is R code that will accomplish these tasks. The first line reads the data from a file named quakes.dat. The data are listed in time order from left to right in the lines of the file. If you were to download the file, you should download it into a folder that you create for storing course data. Then in R, change the working directory to be this folder.

The commands below include explanatory comments, following the #. Those comments do not have to be entered for the command to work.

x=scan("quakes.dat")
x=ts(x) #this makes sure R knows that x is a time series
plot(x, type="b") #time series plot of x with points marked as "o"
install.packages("astsa")
library(astsa) # See note 1 below
lag1.plot(x,1) # Plots x versus lag 1 of x.
acf(x, xlim=c(1,19)) # Plots the ACF of x for lags 1 to 19
xlag1=lag(x,-1) # Creates a lag 1 of x variable. See note 2
y=cbind(x,xlag1) # See note 3 below
ar1fit=lm(y[,1]~y[,2])#Does regression, stores results object named ar1fit
summary(ar1fit) # This lists the regression results
plot(ar1fit$fit,ar1fit$residuals) #plot of residuals versus fits
acf(ar1fit$residuals, xlim=c(1,18)) # ACF of the residuals for lags 1 to 18 

Note!

  1. The astsa library accesses R script(s) written by one of the authors of our textbook (Stoffer). In our program, the lag1.plot command is part of that script. You may read more about the library on the website for our text: ASTSA R package. You must install the astsa package in R before loading the commands in the library statement. Not all available packages are included when you install R on your machine (Cran R Project Web Packages). You only need to run install.packages("astsa") once. In subsequent sessions, the library command alone will bring the commands into your current session.
  2. Note the negative value for the lag in xlag1=lag(x,−1) To lag back in time in R, use a negative lag.
  3. This is a bit tricky. For whatever reason, R has to bind together a variable with its lags for the lags to be in the proper connection with the original variable. The cbind and the ts.intersect commands both accomplish this task. In the code above, the lagged variable and the original variable become the first and second columns of a matrix named y. The regression command (lm) uses these two columns of y as the response and predictor variables in the regression.

General Note! 

If a command that includes quotation marks doesn’t work when you copy and paste from course notes to R, try typing the command in R instead.

The results, as given by R, follow.

The time series plot for the quakes series.

graph

Plot of x versus lag 1 of x.

graph

The ACF of the quakes series.

graph

The regression results:

Coefficients:
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.19070 1.81924 5.052 2.08e-06 ***
y[, 2] 0.54339 0.08528 6.372 6.47e-09 ***

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.122 on 96 degrees of freedom
(2 observations deleted due to missingness)
Multiple R-squared: 0.2972, Adjusted R-squared: 0.2899

Plot of residuals versus fits

graph

ACF of residuals

graph

An example in Lesson 1.2 for this week concerned the weekly cardiovascular mortality rate in Los Angeles County. We used a first difference to account for a linear trend and determine that the first differences may have an AR(1) model.

Download the data: cmort.dat

Following are R commands for the analysis. Again, the commands are commented using #comment.

mort=scan("cmort.dat")
plot(mort, type="o") # plot of mortality rate
mort=ts(mort)
mortdiff=diff(mort,1) # creates a variable = x(t) – x(t-1)
plot(mortdiff,type="o") # plot of first differences
acf(mortdiff,xlim=c(1,24)) # plot of first differences, for 24 lags
mortdifflag1=lag(mortdiff,-1)
y=cbind(mortdiff,mortdifflag1) # bind first differences and lagged first differences
mortdiffar1=lm(y[,1]~y[,2]) # AR(1) regression for first differences
summary(mortdiffar1) # regression results
acf(mortdiffar1$residuals, xlim = c(1,24)) # ACF of residuals for 24 lags. 

We’ll leave it to you to try the code and see the output, if you wish.