This lesson defines the sample autocorrelation function (ACF) in general and derives the pattern of the ACF for an AR(1) model. Recall from Lesson 1.1 for this week that an AR(1) model is a linear model that predicts the present value of a time series using the immediately prior value in time.
Stationary Series
As a preliminary, we define an important concept, that of a stationary series. For an ACF to make sense, the series must be a weakly stationary series. This means that the autocorrelation for any particular lag is the same regardless of where we are in time.
 (Weakly) Stationary Series

A series \(x_t\) is said to be (weakly) stationary if it satisfies the following properties:
 The mean \(E(x_t)\) is the same for all \(t\).
 The variance of \(x_t\) is the same for all \(t\).
 The covariance (and also correlation) between \(x_t\) and \(x_{t-h}\) is the same for all \(t\) at each lag \(h\) = 1, 2, 3, etc.
 Autocorrelation Function (ACF)

Let \(x_t\) denote the value of a time series at time \(t\). The ACF of the series gives correlations between \(x_t\) and \(x_{t-h}\) for \(h\) = 1, 2, 3, etc. Theoretically, the autocorrelation between \(x_t\) and \(x_{t-h}\) equals
\(\dfrac{\text{Covariance}(x_t, x_{t-h})}{\text{Std.Dev.}(x_t)\text{Std.Dev.}(x_{t-h})} = \dfrac{\text{Covariance}(x_t, x_{t-h})}{\text{Variance}(x_t)}\)
The denominator in the second formula occurs because the standard deviation of a stationary series is the same at all times.
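The theoretical definition above has a sample counterpart. The sketch below uses the usual estimator (lag-\(h\) sum of cross-products of deviations from the overall mean, divided by the lag-0 sum of squares); the lesson does not spell out this formula, so treat the function name `sample_acf` and the toy series as illustrative assumptions.

```python
def sample_acf(x, max_lag):
    """Return the sample autocorrelations r_1, ..., r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]          # deviations from the overall mean
    c0 = sum(d * d for d in dev)         # lag-0 sum of squares
    acf = []
    for h in range(1, max_lag + 1):
        # cross-products of values that are h time periods apart
        ch = sum(dev[t] * dev[t + h] for t in range(n - h))
        acf.append(ch / c0)
    return acf


# Toy series (made up for illustration) that repeats every 4 time periods
series = [2.0, 4.0, 6.0, 4.0, 2.0, 4.0, 6.0, 4.0, 2.0, 4.0]
acf_values = sample_acf(series, 3)
for lag, r in enumerate(acf_values, start=1):
    print(f"lag {lag}: {r:.3f}")
```

Using a single overall mean and a single lag-0 denominator mirrors the stationarity assumption: the mean and variance are taken to be the same at every time point.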
The last property of a weakly stationary series says that the theoretical value of the autocorrelation at a particular lag is the same across the whole series. An interesting property of a stationary series is that theoretically it has the same structure forward as it does backward.
Many stationary series have recognizable ACF patterns. Most series that we encounter in practice, however, are not stationary. A continual upward trend, for example, is a violation of the requirement that the mean is the same for all \(t\). Distinct seasonal patterns also violate that requirement. The strategies for dealing with nonstationary series will unfold during the first three weeks of the semester.
The First-order Autoregression Model
We’ll now look at theoretical properties of the AR(1) model. Recall from Lesson 1.1 that the 1st order autoregression model is denoted as AR(1). In this model, the value of \(x\) at time \(t\) is a linear function of the value of \(x\) at time \(t-1\). The algebraic expression of the model is as follows:
\(x_t = \delta + \phi_1x_{t-1}+w_t\)
Assumptions
 \(w_t \overset{iid}{\sim} N(0, \sigma^2_w)\), meaning that the errors are independently distributed with a normal distribution that has mean 0 and constant variance.
 Properties of the errors \(w_t\) are independent of \(x_t\).
 The series \(x_1\), \(x_2\), ... is (weakly) stationary. A requirement for a stationary AR(1) is that \(|\phi_1| < 1\). We’ll see why below.
Properties of the AR(1)
Formulas for the mean, variance, and ACF for a time series process with an AR(1) model follow.
 The (theoretical) mean of \(x_t\) is
\(E(x_t)=\mu = \dfrac{\delta}{1-\phi_1}\)
 The variance of \(x_t\) is
\(\text{Var}(x_t) = \dfrac{\sigma^2_w}{1-\phi_1^2}\)
 The correlation between observations \(h\) time periods apart is
\(\rho_h = \phi^h_1\)
This defines the theoretical ACF for a time series variable with an AR(1) model.
\(\phi_1\) is the slope in the AR(1) model and we now see that it is also the lag 1 autocorrelation.
Details of the derivations of these properties are in the Appendix to this lesson for interested students.
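One way to get a feel for these three formulas is to simulate a long AR(1) series and compare the sample mean, variance, and lag 1 autocorrelation with the theoretical values. The following is a minimal sketch, not part of the lesson's own analysis; the parameter values (\(\delta = 2\), \(\phi_1 = 0.6\), \(\sigma_w = 1\)), the burn-in length, and the function name `simulate_ar1` are all assumptions chosen for illustration.

```python
import random


def simulate_ar1(delta, phi, sigma_w, n, burn_in=500, seed=0):
    """Simulate n values of x_t = delta + phi * x_{t-1} + w_t with normal errors."""
    rng = random.Random(seed)
    x = delta / (1 - phi)                # start at the theoretical mean
    out = []
    for i in range(n + burn_in):
        x = delta + phi * x + rng.gauss(0.0, sigma_w)
        if i >= burn_in:                 # discard the burn-in values
            out.append(x)
    return out


delta, phi, sigma_w = 2.0, 0.6, 1.0      # hypothetical parameter values
x = simulate_ar1(delta, phi, sigma_w, n=50000)
n = len(x)

mean_hat = sum(x) / n
var_hat = sum((v - mean_hat) ** 2 for v in x) / n
lag1_hat = sum((x[t] - mean_hat) * (x[t + 1] - mean_hat)
               for t in range(n - 1)) / (n * var_hat)

mean_theory = delta / (1 - phi)          # delta / (1 - phi_1)
var_theory = sigma_w ** 2 / (1 - phi ** 2)   # sigma_w^2 / (1 - phi_1^2)

print(f"mean:  sample {mean_hat:.3f}  theory {mean_theory:.3f}")
print(f"var:   sample {var_hat:.3f}  theory {var_theory:.3f}")
print(f"rho_1: sample {lag1_hat:.3f}  theory {phi:.3f}")
```

The sample values will not match the theory exactly; they should only be close when the series is long.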
Pattern of ACF for AR(1) Model
The ACF property defines a distinct pattern for the autocorrelations. For a positive value of \(\phi_1\), the ACF exponentially decreases to 0 as the lag \(h\) increases. For negative \(\phi_1\), the ACF also exponentially decays to 0 as the lag increases, but the algebraic signs for the autocorrelations alternate between positive and negative.
Following is the ACF of an AR(1) with \(\phi_1\)= 0.6, for the first 12 lags.
The tapering pattern:
The ACF of an AR(1) with \(\phi_1\) = −0.7 follows.
The alternating and tapering pattern.
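Both patterns follow directly from \(\rho_h = \phi^h_1\). As a small sketch, the code below tabulates the theoretical ACF for the two values used above, \(\phi_1 = 0.6\) and \(\phi_1 = -0.7\), over the first 12 lags; the function name `ar1_acf` is illustrative.

```python
def ar1_acf(phi, max_lag):
    """Theoretical AR(1) autocorrelations rho_h = phi**h for h = 1..max_lag."""
    return [phi ** h for h in range(1, max_lag + 1)]


for phi in (0.6, -0.7):
    print(f"phi_1 = {phi}")
    for h, rho in enumerate(ar1_acf(phi, 12), start=1):
        print(f"  lag {h:2d}: {rho: .4f}")
```

For \(\phi_1 = 0.6\) every value is positive and smaller than the one before it; for \(\phi_1 = -0.7\) the magnitudes shrink in the same way while the signs alternate, starting negative at lag 1.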
Example 1-3
In Example 1 of Lesson 1.1, we used an AR(1) model for annual earthquakes in the world with seismic magnitude greater than 7. Here’s the sample ACF of the series:
Lag   ACF
1     0.541733
2     0.418884
3     0.397955
4     0.324047
5     0.237164
6     0.171794
7     0.190228
8     0.061202
9     0.048505
10    0.106730
11    0.043271
12    0.072305
The sample autocorrelations taper, although not as fast as they should for an AR(1). For instance, theoretically the lag 2 autocorrelation for an AR(1) equals the squared value of the lag 1 autocorrelation. Here, the observed lag 2 autocorrelation is 0.418884. That’s somewhat greater than the squared value of the lag 1 autocorrelation (\(0.541733^2 = 0.293\)). But we managed to do okay (in Lesson 1.1) with an AR(1) model for the data. For instance, the residuals looked okay. This brings up an important point: the sample ACF will rarely fit a perfect theoretical pattern. A lot of the time you just have to try a few models to see what fits.
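The comparison can be extended past lag 2. The sketch below takes the first six sample autocorrelations from the table and sets them beside the powers of the lag 1 value, which is what a perfect AR(1) pattern would give; the variable names are illustrative.

```python
# First six sample autocorrelations of the earthquake series (from the table above)
sample_acf = [0.541733, 0.418884, 0.397955, 0.324047, 0.237164, 0.171794]

r1 = sample_acf[0]
# Under an exact AR(1) pattern, the lag-h autocorrelation would be r1**h
implied = [r1 ** h for h in range(1, len(sample_acf) + 1)]

for h, (obs, imp) in enumerate(zip(sample_acf, implied), start=1):
    print(f"lag {h}: observed {obs:.3f}, AR(1)-implied {imp:.3f}")
```

At every lag beyond the first, the observed value sits above the implied one, which is the slower-than-expected tapering described above.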
We’ll study the ACF patterns of other ARIMA models during the next three weeks. Each model has a different pattern for its ACF, but in practice the interpretation of a sample ACF is not always so clear-cut.
A reminder: Residuals usually are theoretically assumed to have an ACF that has correlation = 0 for all lags.
Example 1-4
Here’s a time series of the daily cardiovascular mortality rate in Los Angeles County, 1970-1979.
There is a slight downward trend, so the series may not be stationary. To create a (possibly) stationary series, we’ll examine the first differences \(y_t=x_t-x_{t-1}\). This is a common time series method for creating a detrended series and thus potentially a stationary series. Think about a straight line: there are constant differences in average \(y\) for each change of 1 unit in \(x\).
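The straight-line remark can be checked directly. In this small sketch, a made-up series that falls by 0.5 per time period is differenced, and every first difference comes out equal to the slope, so the trend is gone; the helper name `first_difference` and the numbers are assumptions for illustration, not the mortality data.

```python
def first_difference(x):
    """Return the first differences y_t = x_t - x_{t-1} for t = 1, ..., len(x) - 1."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]


# Hypothetical straight-line trend: starts at 10 and drops 0.5 each period
trend = [10.0 - 0.5 * t for t in range(8)]
diffs = first_difference(trend)

print(trend)
print(diffs)   # every difference equals the slope of the line
```

Note that the differenced series is one value shorter than the original.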
The time series plot of the first differences is the following:
The sample estimate of the autocorrelation function of the first differences is the following:
Lag   ACF
1     -0.506029
2     0.205100
3     -0.126110
4     0.062476
5     0.015190
This looks like the pattern of an AR(1) with a negative lag 1 autocorrelation.
The lag 2 correlation is roughly equal to the squared value of the lag 1 correlation. The lag 3 correlation is nearly exactly equal to the cubed value of the lag 1 correlation, and the lag 4 correlation nearly equals the fourth power of the lag 1 correlation. Thus an AR(1) model may be a suitable model for the first differences \(y_t = x_t - x_{t-1}\).
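The arithmetic behind that comparison is sketched below. It assumes the lag 1 and lag 3 sample autocorrelations are negative, as the alternating pattern with a negative lag 1 autocorrelation implies; the magnitudes are the ones listed in the table.

```python
# Sample ACF of the first differences; signs at lags 1 and 3 are taken as
# negative, consistent with an AR(1) that has a negative lag 1 autocorrelation
r = {1: -0.506029, 2: 0.205100, 3: -0.126110, 4: 0.062476}

for h in (2, 3, 4):
    power = r[1] ** h                    # what an exact AR(1) pattern would give
    print(f"lag {h}: observed {r[h]: .4f}, lag-1 value to power {h} = {power: .4f}")
```

The lag 3 and lag 4 values agree with the powers to within about 0.004, while lag 2 is off by roughly 0.05, which matches the wording "roughly", "nearly exactly", and "nearly" above.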
Let \(y_t\) denote the first differences, so that \(y_t = x_t - x_{t-1}\) and \(y_{t-1} = x_{t-1}-x_{t-2}\). We can write this AR(1) model as
\(y_t = \delta + \phi_1y_{t-1}+w_t\)
Using R, we found that the estimated model for the first differences is
\(\widehat{y}_t = 0.04627-0.50636y_{t-1}\)
Some R code for this example will be given in Lesson 1.3 for this week.
Appendix: Derivations of Properties of AR(1)
Generally you won’t be responsible for reproducing theoretical derivations, but interested students may want to see the derivations for the theoretical properties of an AR(1).
The algebraic expression of the model is as follows:
\(x_t = \delta + \phi_1x_{t-1}+w_t\)
Assumptions
 \(w_t \overset{iid}{\sim} N(0, \sigma^2_w)\), meaning that the errors are independently distributed with a normal distribution that has mean 0 and constant variance.
 Properties of the errors \(w_t\) are independent of \(x_t\).
 The series \(x_1\), \(x_2\), ... is (weakly) stationary. A requirement for a stationary AR(1) is that \(|\phi_1|<1\). We’ll see why below.
Mean
\(E(x_t) = E(\delta + \phi_1x_{t-1}+w_t) = E(\delta) + E(\phi_1x_{t-1}) + E(w_t) = \delta + \phi_1E(x_{t-1}) + 0\)
With the stationary assumption, \(E(x_t) = E(x_{t-1})\). Let \(\mu\) denote this common mean. Thus \(\mu = \delta + \phi_1\mu\). Solve for \(\mu\) to get
\(\mu = \dfrac{\delta}{1-\phi_1}\)
Variance
By independence of errors and values of \(x\),
\begin{eqnarray}
\text{Var}(x_t) &=& \text{Var}(\delta)+\text{Var}(\phi_1 x_{t-1})+\text{Var}(w_t) \nonumber \\
&=& \phi_1^2 \text{Var}(x_{t-1})+\sigma^2_w
\end{eqnarray}
By the stationary assumption, \(\text{Var}(x_t) = \text{Var}(x_{t-1})\). Substitute \(\text{Var}(x_t)\) for \(\text{Var}(x_{t-1})\) and then solve for \(\text{Var}(x_t)\). Because \(\text{Var}(x_t)>0\), it follows that \((1-\phi^2_1)>0\) and therefore \(|\phi_1|<1\).
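For readers who want the substitution and the solving step written out, one way to lay it out is the following (same symbols as above):

```latex
\begin{eqnarray}
\text{Var}(x_t) &=& \phi_1^2 \text{Var}(x_t)+\sigma^2_w \nonumber \\
(1-\phi_1^2)\,\text{Var}(x_t) &=& \sigma^2_w \nonumber \\
\text{Var}(x_t) &=& \dfrac{\sigma^2_w}{1-\phi_1^2}
\end{eqnarray}
```

Since \(\sigma^2_w\) is positive, the last line is positive only when the denominator \(1-\phi_1^2\) is positive, which is where the stationarity requirement on \(\phi_1\) comes from.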
Autocorrelation Function (ACF)
To start, assume the data have mean 0, which happens when \(\delta=0\), and \(x_t=\phi_1x_{t-1}+w_t\). In practice this isn’t necessary, but it simplifies matters. Values of variances, covariances and correlations are not affected by the specific value of the mean.
Let \(\gamma_h = E( x_t x_{t + h }) = E ( x_t x_{t - h})\), the covariance between observations \(h\) time periods apart (when the mean = 0). Let \(\rho_h\) = correlation between observations that are \(h\) time periods apart.
Covariance and correlation between observations one time period apart
\(\gamma_1 = \text{E}(x_t x_{t+1}) = \text{E}(x_t(\phi_1 x_t + w_{t+1})) = \text{E}(\phi_1 x_t^2 + x_t w_{t+1}) = \phi_1 \text{Var}(x_t)\)
\(\rho_1 = \dfrac{\text{Cov}(x_t, x_{t+1})}{\text{Var}(x_t)} = \dfrac{\phi_1 \text{Var}(x_t)}{\text{Var}(x_t)} = \phi_1\)
Covariance and correlation between observations \(h\) time periods apart
To find the covariance \(\gamma_h\), multiply each side of the model for \(x_t\) by \(x_{t-h}\), then take expectations.
\(x_t = \phi_1x_{t-1}+w_t\)
\(x_{t-h}x_t = \phi_1x_{t-h}x_{t-1}+x_{t-h}w_t\)
\(E(x_{t-h}x_t) = E(\phi_1x_{t-h}x_{t-1})+E(x_{t-h}w_t)\)
\(\gamma_h = \phi_1 \gamma_{h-1}\)
If we start at \(\gamma_1\) and move recursively forward, we get \(\gamma_h = \phi^h_1 \gamma_0\). By definition, \(\gamma_0 = \text{Var}(x_t)\), so this is \(\gamma_h = \phi^h_1\text{Var}(x_t)\). The correlation is then
\( \rho_h = \dfrac{\gamma_h}{\text{Var}(x_t)} = \dfrac{\phi_1^h \text{Var}(x_t)}{\text{Var}(x_t)} = \phi_1^h \)
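The recursion can also be unrolled numerically. This sketch starts from \(\gamma_0 = \text{Var}(x_t)\), applies \(\gamma_h = \phi_1\gamma_{h-1}\) repeatedly, and confirms that dividing by \(\gamma_0\) gives \(\phi_1^h\); the values \(\phi_1 = -0.7\) and \(\sigma_w = 1\) are assumptions for illustration.

```python
phi, sigma_w = -0.7, 1.0                 # hypothetical parameter values

gamma0 = sigma_w ** 2 / (1 - phi ** 2)   # gamma_0 = Var(x_t)
gammas = [gamma0]
for h in range(1, 7):
    gammas.append(phi * gammas[h - 1])   # gamma_h = phi_1 * gamma_{h-1}

rhos = [g / gamma0 for g in gammas]      # rho_h = gamma_h / Var(x_t)

for h, rho in enumerate(rhos):
    print(f"lag {h}: recursion {rho: .6f}, phi**h {phi ** h: .6f}")
```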