T.2.5.1 - ARIMA Models

Since many of the time series models have a regression structure, it is beneficial to introduce a general class of time series models called autoregressive integrated moving averages or ARIMA models. They are also referred to as Box-Jenkins models, due to the systematic methodology of identifying, fitting, checking, and utilizing ARIMA models, which was popularized by George Box and Gwilym Jenkins in 1976. Before we write down a general ARIMA model, we need to introduce a few additional concepts.

Suppose we have the time series \(Y_{1},Y_{2},\ldots,Y_{t}\). If the value for each Y is determined exactly by a mathematical formula, then the series is said to be deterministic. If the future values of Y can only be described through their probability distribution, then the series is said to be a stochastic process.¹ A special class of stochastic processes is a stationary stochastic process, which occurs when the probability distribution for the process is the same for all starting values of t. Such a process is completely defined by its mean, variance, and autocorrelation function. When a time series exhibits nonstationary behavior, then part of our objective will be to transform it into a stationary process.

When stationarity is not an issue, then we can define an autoregressive moving average or ARMA model as follows:

\(\begin{equation*} Y_{t}=\sum_{i=1}^{p}\phi_{i}Y_{t-i}+a_{t}-\sum_{j=1}^{q}\theta_{j}a_{t-j}, \end{equation*} \)

where \(\phi_{1},\ldots,\phi_{p}\) are the autoregressive parameters to be estimated, \(\theta_{1},\ldots,\theta_{q}\) are the moving average parameters to be estimated, and \(a_{1},\ldots,a_{t}\) are a series of unknown random errors (or residuals) that are assumed to follow a normal distribution. This is also referred to as an ARMA(p,q) model. The model can be simplified by introducing the Box-Jenkins backshift operator, which is defined by the following relationship:

\(\begin{equation*} B^{p}X_{t}=X_{t-p}, \end{equation*}\)

such that \(X_{1},\ldots,X_{t}\) is any time series and \(p<t\). Using the backshift notation yields the following:

\(\begin{equation*} \biggl(1-\sum_{i=1}^{p}\phi_{i}B^{i}\biggr)Y_{t}=\biggl(1-\sum_{j=1}^{q}\theta_{j}B^{j}\biggr)a_{t}, \end{equation*}\)

which is often reduced further to

\(\begin{equation*} \phi_{p}(B)Y_{t}=\theta_{q}(B)a_{t}, \end{equation*}\)

where \(\phi_{p}(B)=(1-\sum_{i=1}^{p}\phi_{i}B^{i})\) and \(\theta_{q}(B)=(1-\sum_{j=1}^{q}\theta_{j}B^{j})\).

When time series exhibit nonstationary behavior (which commonly occurs in practice), then the ARMA model presented above can be extended and written using differences, which are defined as follows:

\(\begin{align*} W_{t}&=Y_{t}-Y_{t-1}=(1-B)Y_{t}\\ W_{t}-W_{t-1}&=Y_{t}-2Y_{t-1}+Y_{t-2}\\ &=(1-B)^{2}Y_{t}\\ & \ \ \vdots\\ W_{t}-\sum_{k=1}^{d}W_{t-k}&=(1-B)^{d}Y_{t}, \end{align*}\)

where d is the order of differencing. Replacing \(Y_{t}\) in the ARMA model with the differences defined above yields the formal ARIMA(p,d,q) model:

\(\begin{equation*} \phi_{p}(B)(1-B)^{d}Y_{t}=\theta_{q}(B)a_{t}. \end{equation*}\)

An alternative way to deal with nonstationary behavior is to simply fit a linear trend to the time series and then fit a Box-Jenkins model to the residuals from the linear fit. This will provide a different fit and a different interpretation, but is still a valid way to approach a process exhibiting nonstationarity.

Seasonal time series can also be incorporated into a Box-Jenkins framework. The following general model is usually recommended:

\(\begin{equation*} \phi_{p}(B)\Phi(P)(B^{s})(1-B)^{d}(1-B^{s})^{D}Y_{t}=\theta_{q}(B)\Theta_{Q}(B^{s})a_{t}, \end{equation*}\)

where p, d, and q are as defined earlier, s is a (known) number of seasons per timeframe (e.g., years), D is the order of the seasonal differencing, and P and Q are the autoregressive and moving average orders, respectively, when accounting for the seasonal shift. Moreover, the operator polynomials \(\phi_{p}(B)\) and \(\theta_{q}(B)\) are as defined earlier, while \(\Phi_{P}(B)=(1-\sum_{i=1}^{P}\Phi_{i}B^{s\times i})\) and \(\Theta_{Q}(B)=(1-\sum_{j=1}^{Q}\Theta_{j}B^{s\times j})\). Luckily, the maximum value of p, d, q, P, D, and Q is usually 2, so the resulting expression is relatively simple.

With all of the modeling discussion provided above, we will now provide a brief overview of how to implement the Box-Jenkins methodology in practice, which (fortunately) most statistical software packages will perform for you:

Model Identification: Using plots of the data, autocorrelations, partial autocorrelations, and other information, a class of simple ARIMA models is selected. This amounts to estimating (or guesstimating) an appropriate value for d followed by estimates for p and q (as well as P, D, and Q in the seasonal time series setting).
Model Estimation: The autoregressive and moving average parameters are found via an optimization method like maximum likelihood.
Diagnostic Checking: The fitted model is checked for inadequacies by studying the autocorrelations of the residual series (i.e., the time-ordered residuals).

The above is then iterated until there appears to be minimal to no improvement in the fitted model.

1 It should be noted that stochastic processes are itself a heavily-studied and very important statistical subject.