11.2 Vector Autoregressive models VAR(p) models

VAR models (vector autoregressive models) are used for multivariate time series. The structure is that each variable is a linear function of past lags of itself and past lags of the other variables.

As an example suppose that we measure three different time series variables, denoted by \(x_{t,1}\), \(x_{t,2}\), and \(x_{t,3}\).

The vector autoregressive model of order 1, denoted as VAR(1), is as follows:

\(x_{t,1} = \alpha_{1} + \phi_{11} x_{t−1,1} + \phi_{12}x_{t−1,2} + \phi_{13}x_{t−1,3} + w_{t,1}\)

\(x_{t,2} = \alpha_{2} + \phi_{21} x_{t−1,1} + \phi_{22}x_{t−1,2} + \phi_{23}x_{t−1,3} + w_{t,2}\)

\(x_{t,3} = \alpha_{3} + \phi_{31} x_{t−1,1} + \phi_{32}x_{t−1,2} + \phi_{33}x_{t−1,3} + w_{t,3}\)

Each variable is a linear function of the lag 1 values for all variables in the set.

In a VAR(2) model, the lag 2 values for all variables are added to the right sides of the equations, In the case of three x-variables (or time series) there would be six predictors on the right side of each equation, three lag 1 terms and three lag 2 terms.

In general, for a VAR(p) model, the first p lags of each variable in the system would be used as regression predictors for each variable.

VAR models are a specific case of more general VARMA models. VARMA models for multivariate time series include the VAR structure above along with moving average terms for each variable. More generally yet, these are special cases of ARMAX models that allow for the addition of other predictors that are outside the multivariate set of principal interest.

Here, as in Section 5.6 of the text, we’ll focus on VAR models.

On page 304, the authors fit the model of the form

\(\mathbf{x}_t = \Gamma \mathbf{u}_t + \phi \mathbf{x}_{t-1} + \mathbf{w}_t\)

where \(\mathbf{u}_t = (1, t)’\) includes terms to simultaneously fit the constant and trend. It arose from macroeconomic data where large changes in the data permanently affect the level of the series.

There is a not so subtle difference here from previous lessons in that we now are fitting a model to data that need not be stationary. In previous versions of the text, the authors separately de-trended each series using a linear regression with \(t\), the index of time, as the predictor variable. The de-trended values for each of the three series are the residuals from this linear regression on \(t\). The de-trending is useful conceptually because it takes away the common steering force that time may have on each series and created stationarity as we have seen in past lessons. This approach results in similar coefficients, though slightly different as we are now simultaneously fitting the intercept and trend together in a multivariate OLS model.

The R vars library authored by Bernhard Pfaff has the capability to fit this model with trend. Let’s look at 2 examples: a trend-stationary model and a stationary model.

Trend-Stationary Model Section

Example 5.10 from the text is a trend-stationary model in that the de-trended series are stationary. We need not de-trend each series as described above because we can include the trend directly in the VAR model with the VAR command. Let’s examine the code and example from the text by fitting the model above:

install.packages("vars") #If not already installed
install.packages("astsa") #If not already installed
x = cbind(cmort, tempr, part)
plot.ts(x , main = "", xlab = "")
fitvar1= VAR(x, p=1, type="both")
  • The first two commands load the necessary commands from the vars library and the necessary data from our text’s library.
  • The cbind command creates a vector of response variables (a necessary step for multivariate responses).
  • The VAR command does estimation of AR models using ordinary least squares while simultaneously fitting the trend, intercept, and ARIMA model. The p = 1 argument requests an AR(1) structure and “both” fits constant and trend. With the vector of responses, it’s actually a VAR(1).

Following is the output from the VAR command for the variable tempr (the text provides the output for cmort):

Estimation results for equation tempr:


tempr = cmort.11 + tempr.11 + part.11 + const + trend

  Estimate Std. Error t value Pr(>|t|)  
cmort.l1 -0.2440046 0.042105 -5.796 1.20e-08 ***
tempr.l1 0.486596 0.036899 13.187 < 2e-16 ***
part.l1 -0.127661 0.021985 -5.807 1.13e-08 ***
const 67.585598 5.541550 12.196 < 2e-16 ***
trend -0.006912 0.002268 -3.048 0.00243 **

Signif. codes: 0 `***´ 0.001 `**´ 0.01 `*´ 0.05 `.´ 0.1 ` ´ 1

Residual standard error: 6.4 on 502 degrees of freedom
Multiple R-Squared: 0.5007, Adjusted R-squared: 0.4967
F-statistic: 125.9 on 4 and 502 DF, p-value: < 2.2e-16

The coefficients for a variable are listed in the Estimate column. The .l1 attached to each variable name indicates that they are lag 1 variables.

Using notation T = temperature, t=time (collected weekly), M = mortality rate, and P = pollution, the equation for temperature is

\(\widehat{T}_t = 67.586 - .007 t - 0.244 M_{t-1} + 0.487 T_{t-1} - 0.128 P_{t-1}\)

The equation for mortality rate is

\(\widehat{M}_t = 73.227 – 0.014 t + 0.465 M_{t-1} - 0.361 T_{t-1} + 0.099 P_{t-1}\)

The equation for pollution is

\(\widehat{P}_t = 67.464 - .005 t - 0.125 M_{t-1} - 0.477 T_{t-1} +0.581 P_{t-1}.\)

The covariance matrix of the residuals from the VAR(1) for the three variables is printed below the estimation results. The variances are down the diagonal and could possibly be used to compare this model to higher order VARs. The determinant of that matrix is used in the calculation of the BIC statistic that can be used to compare the fit of the model to the fit of other models (see formulas 5.89 and 5.90 of the text).

For further references on this technique see Analysis of integrated and co-integrated time series with R by Pfaff and also Campbell and Perron [1991].

In Example 5.11, the authors give results for a VAR(2) model for the mortality rate data. In R, you may fit the VAR(2) model with the command

summary(VAR(x, p=2, type="both"))

The output, as displayed by the VAR command is as follows:

Estimation results for equation tempr:


tempr = cmort.11 + tempr.11 + part.11 + cmort.12 + tempr.12 + part.12 + const + trend

  Estimate Std. Error t value Pr(>|t|)  
cmort.l1 -0.108889 0.050667 -2.149 0.03211 *
tempr.l1 0.260963 0.051292 5.088 5.14e-07 ***
part.l1 -0.050542 0.027844 -1.815 0.07010 .
cmort.l2 -0.040870 0.048587 -0.81 0.40065  
tempr.l2 0.355592 0.051762 6.870 1.93e-11 ***
part.l2 -0.095114 0.029295 -3.247 0.00125 **
const 49.880485 6.854540 7.277 1.34e-12 ***
trend -0.004754 0.002308 -2.060 0.03993 *

Signif. codes: 0 `***´ 0.001 `**´ 0.01 `*´ 0.05 `.´ 0.1 ` ´ 1

Residual standard error: 6.134 on 498 degrees of freedom
Multiple R-Squared: 0.5445, Adjusted R-squared: 0.5381
F-statistic: 85.04 on 7 and 498 DF, p-value: < 2.2e-16

Again, the coefficients for a particular variable are listed in the Estimate column. As an example, the estimated equation for temperature is

\(\widehat{T}_t = 49.88 - .005 t - 0.109 M_{t-1} + 0.261 T_{t-1} – 0.051 P_{t-1} - 0.041 M_{t-2} + 0.356 T_{t-2} – 0.095 P_{t-2}\)

We will discuss information criterion statistics to compare VAR models of different orders in the homework.

Residuals are also available for analysis. For example, if we assign the VAR command to an object titled fitvar2 in our program,

fitvar2 = VAR(x, p=2, type="both")

then we have access to the matrix residuals(fitvar2). This matrix will have three columns, one column of residuals for each variable.

For example, we might use


to see the ACF of the residuals for mortality rate after fitting the VAR(2) model.

Following is the ACF that resulted from the command just described. It looks good for a residual ACF. (The big spike at the beginning is the unimportant lag 0 correlation.)

  R plot of residuals

The following two commands will create ACFs for the residuals for the other two variables.


They also resemble white noise.

We may also examine these plots in the cross-correlation matrix provided by acf(residuals(fitvar2)):

cross-correlation matrix plots

The plots along the diagonal are the individual ACFs for each model’s residuals that we just discussed above. In addition, we now see the cross-correlation plots of each set of residuals. Ideally, these would also resemble white noise at every lag, including lag 0, however we do see remaining cross-correlations, especially between temperature and pollution. As our authors note, this model does not adequately capture the complete association between these variables in time.

Stationary Model Section

Lets explore an example where the original data are stationary and examine the VAR code by fitting the model above with both a constant and trend. Using R, we simulated n = 500 sample values using the VAR(2) model

\(y_{t,1} = 10 +.25y_{t−1,1} - .20y_{t−1,2} -.40y_{t−2,1} -.65y_{t−2,2}\)

\(y_{t,2} = 20 +.60y_{t−1,1} - .45y_{t−1,2} +.50y_{t−2,1} +.35y_{t−2,2}\)

Using the VAR command explained above:

summary(VAR(cbind(y1,y2), p=2, type="both"))

We obtain the following output:

VAR Estimation Results:


Endogenous variables: y1, y2
Deterministic variables: both Sample size: 498
Log Likelihood: -1396.493
Roots of the characteristic polynomial: 0.9434 0.9434 0.8793 0.1518
VAR (y = cbind(y1, y2), p = 2, type = "both")

Estimation results for equation y1:


y1 = y1.11 + y2.11 + y1.12 + y2.12 + const + trend

  Estimate Std. Error t value Pr(>|t|)  
y1.11 0.2936162 0.0296083 9.917 < 2e-16 ***
y2.11 -0.1899663 0.0372244 -5.103 4.78e-07 ***
y1.12 -0.4534484 0.0374431 -12.110 < 2e-16 ***
y2.12 -0.6357890 0.0243282 -26.134 < 2e-16 ***
const 9.5042440 0.8742211 10.872 < 2e-16 ***
trend 0.0001730 0.0003116 0.555   0.579  

Signif. codes: 0 `***´ 0.001 `**´ 0.01 `*´ 0.05 `.´ 0.1 ` ´ 1

Residual standard error: 0.9969 on 492 degrees of freedom
Multiple R-Squared: 0.8582,   Adjusted R-squared: 0.8568
F-statistic: 595.5 on 5 and 492 DF, p-value: < 2.2e-16

The estimates are very close to the simulated coefficients and the trend is not significant, as expected. For stationary data, when detrending is unnecessary, you may also use the ar.ols command to fit a VAR model:

fitvar2 = ar.ols(cbind(y1, y2), order=2)


  y1 y2
y1 0.2930 -0.1913
y2 0.6506 -0.3998


,  ,  2

  y1 y2
y1 -0.4523 -0.6365
y2 0.4534 0.3847


y1 y2
-0.043637 -0.003306


  y1 y2
y1 0.98244 0.03339
y2 0.03339 0.96656

In the first matrix given, read across a row to get the coefficients for a variable. The preceding commas followed by 1 or 2 indicate whether the coefficients are lag 1 or lag 2 variables respectively. The intercepts of the equations are given under $x.intercept – one intercept per variable.

The matrix under $var.pred gives the variance-covariance matrix of the residuals from the VAR(2) for the two variables. The variances are down the diagonal and could possibly be used to compare this model to higher order VARs as noted above. 

The standard errors of the AR coefficients are given by the fitvar2$asy.se.coef command. The output is:

,  ,  y1

  y1 y2
y1 0.02941649 0.02917778
y2 0.03716747 0.03686586

,  ,  y2

  y1 y2
y1 0.03693359 0.03663388
y2 0.02415371 0.02395770

As with the coefficients, read across rows. The first row gives the standard errors of the coefficients for the lag 1 variables that predict y1. The second row gives the standard errors for the coefficients that predict y2.

You may note that the coefficients are close to the VAR command except the intercept. This is because ar.ols estimates the model for x-mean(x). To match the intercept provided by the summary(VAR(cbind(y1,y2), p=2, type="const")) command, you must calculate the intercept as follows:

$$ (y_{t,1} - \widehat{\mu}_1) = \alpha_1 + \phi_{11} (y_{t−1,1} – \widehat{\mu}_1) + \phi_{12} (y_{t−1,2} – \widehat{\mu}_2) + \phi_{21}(y_{t−2,1} – \widehat{\mu}_1) + \phi_{22}(y_{t−2,2} – \widehat{\mu}_2) + w_{t,1}  $$

$$ \ \ \ \ \ \ \ \ \ \ \ \   y_{t,1} = \alpha_1 + \widehat{\mu}_1 (1- \phi_{11}- \phi_{21}) – \widehat{\mu}_2 (\phi_{12}+\phi_{22}) + \phi_{11}y_{t−1,1} + \phi_{12} y_{t−1,2} + \phi_{21}y_{t−2,1} + \phi_{22}y_{t−2,2} + w_{t,1}  $$

In our example, the intercept for the simulated model for yt,1 equals

-0.043637 -2.733607*(1-0.2930+0.4523) – 15.45479*(-0.1913-0.6365) = 9.580768,

and the estimated equation for yt,1

\(y_{t,1}= 9.58 +.29y_{t−1,1} - .19y_{t−1,2} -.45y_{t−2,1} -.64y_{t−2,2}\)

Difference stationary refers to the situation where differencing is required to obtain stationarity.  If the series is expressed as an AR process and the AR polynomial contains a unit root, that is if one root of the autoregressive polynomial lies on the unit circle, e.g. for an AR(1), \(\phi_1 = 1\), then differencing is necessary. The Dickey-Fuller test tests for the presence of a unit root and is accessible in the tseries library with the command adf.test().  You can read more about unit root testing in Chapter 5.2 of our text.