1.3 R Code for Two Examples in Lessons 1.1 and 1.21.3 R Code for Two Examples in Lessons 1.1 and 1.2
One example in Lesson 1.1 and Lesson 1.2 concerned the annual number of earthquakes worldwide with a magnitude greater than 7.0 on the seismic scale. We identified an AR(1) model (autoregressive model of order 1), estimated the model, and assessed the residuals. Below is R code that will accomplish these tasks. The first line reads the data from a file named quakes.dat. The data are listed in time order from left to right in the lines of the file. If you were to download the file, you should download it into a folder that you create for storing course data. Then in R, change the working directory to be this folder.
The commands below include explanatory comments, following the #. Those comments do not have to be entered for the command to work.
x=scan("quakes.dat") x=ts(x) #this makes sure R knows that x is a time series plot(x, type="b") #time series plot of x with points marked as "o" install.packages("astsa") library(astsa) # See note 1 below lag1.plot(x,1) # Plots x versus lag 1 of x. acf(x, xlim=c(1,19)) # Plots the ACF of x for lags 1 to 19 xlag1=lag(x,-1) # Creates a lag 1 of x variable. See note 2 y=cbind(x,xlag1) # See note 3 below ar1fit=lm(y[,1]~y[,2])#Does regression, stores results object named ar1fit summary(ar1fit) # This lists the regression results plot(ar1fit$fit,ar1fit$residuals) #plot of residuals versus fits acf(ar1fit$residuals, xlim=c(1,18)) # ACF of the residuals for lags 1 to 18
- The astsa library accesses R script(s) written by one of the authors of our textbook (Stoffer). In our program, the lag1.plot command is part of that script. You may read more about the library on the website for our text: ASTSA R package. You must install the astsa package in R before loading the commands in the library statement. Not all available packages are included when you install R on your machine (Cran R Project Web Packages). You only need to run install.packages("astsa") once. In subsequent sessions, the library command alone will bring the commands into your current session.
- Note the negative value for the lag in xlag1=lag(x,−1) To lag back in time in R, use a negative lag.
- This is a bit tricky. For whatever reason, R has to bind together a variable with its lags for the lags to be in the proper connection with the original variable. The cbind and the ts.intersect commands both accomplish this task. In the code above, the lagged variable and the original variable become the first and second columns of a matrix named y. The regression command (lm) uses these two columns of y as the response and predictor variables in the regression.
If a command that includes quotation marks doesn’t work when you copy and paste from course notes to R, try typing the command in R instead.
The results, as given by R, follow.
The time series plot for the quakes series.
Plot of x versus lag 1 of x.
The ACF of the quakes series.
The regression results:
|Estimate||Std. Error||t value||Pr(>|t|)|
|y[, 2]||0.54339||0.08528||6.372||6.47e-09 ***|
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 6.122 on 96 degrees of freedom
(2 observations deleted due to missingness)
Multiple R-squared: 0.2972, Adjusted R-squared: 0.2899
Plot of residuals versus fits
ACF of residuals
An example in Lesson 1.2 for this week concerned the weekly cardiovascular mortality rate in Los Angeles County. We used a first difference to account for a linear trend and determine that the first differences may have an AR(1) model.
Download the data: cmort.dat
Following are R commands for the analysis. Again, the commands are commented using #comment.
mort=scan("cmort.dat") plot(mort, type="o") # plot of mortality rate mort=ts(mort) mortdiff=diff(mort,1) # creates a variable = x(t) – x(t-1) plot(mortdiff,type="o") # plot of first differences acf(mortdiff,xlim=c(1,24)) # plot of first differences, for 24 lags mortdifflag1=lag(mortdiff,-1) y=cbind(mortdiff,mortdifflag1) # bind first differences and lagged first differences mortdiffar1=lm(y[,1]~y[,2]) # AR(1) regression for first differences summary(mortdiffar1) # regression results acf(mortdiffar1$residuals, xlim = c(1,24)) # ACF of residuals for 24 lags.
We’ll leave it to you to try the code and see the output, if you wish.