Decomposition procedures are used to describe the trend and seasonal factors in a time series. More extensive decompositions might also include long-run cycles, holiday effects, day-of-week effects and so on. Here, we’ll only consider trend and seasonal decompositions.
One of the main objectives for a decomposition is to estimate seasonal effects that can be used to create and present seasonally adjusted values. A seasonally adjusted value removes the seasonal effect from a value so that trends can be seen more clearly. For instance, in many regions of the U.S. unemployment tends to decrease in the summer due to increased employment in agricultural areas. Thus a drop in the unemployment rate in June compared to May doesn’t necessarily indicate that there’s a trend toward lower unemployment in the country. To see whether there is a real trend, we should adjust for the fact that unemployment is always lower in June than in May.
Basic Structures
The following two structures are considered for basic decomposition models:
- Additive: \(x_t\) = Trend + Seasonal + Random
- Multiplicative: \(x_t\) = Trend * Seasonal * Random
The “Random” term is often called “Irregular” in software for decompositions.
How to Choose Between Additive and Multiplicative Decompositions
- The additive model is useful when the seasonal variation is relatively constant over time.
- The multiplicative model is useful when the seasonal variation increases over time.
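One rough numerical check, offered as a sketch rather than a formal test, is to compare the within-year spread of a series early and late in the record. The example assumes R's built-in JohnsonJohnson quarterly earnings series as the data; the name yearly_range is made up for illustration.

```r
# JohnsonJohnson ships with base R (datasets package):
# quarterly earnings per share, 1960 to 1980.
jj <- JohnsonJohnson

# Within-year spread (max minus min of the four quarters), one value per year.
yearly_range <- aggregate(jj, nfrequency = 1,
                          FUN = function(v) max(v) - min(v))

# A spread that grows over time points toward a multiplicative model;
# a roughly constant spread points toward an additive one.
print(yearly_range)
```

For this series the spread in the last year is far larger than in the first, which is consistent with choosing a multiplicative decomposition.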
Example 5-1
In Lesson 1.1, we looked at quarterly beer production in Australia. The seasonal variation looked to be about the same magnitude across time, so an additive decomposition might be good. Here’s the time series plot:
Example 5-2
We’ve seen at least one example so far in the course where a multiplicative decomposition would be good: the quarterly earnings data for the Johnson & Johnson Corporation. The seasonal variation increases as we move across time, so a multiplicative decomposition could be useful. Here’s the plot of the data:
Basic Steps in Decomposition
The seasonal effects are usually adjusted so that they average to 0 for an additive decomposition or they average to 1 for a multiplicative decomposition.
- The first step is to estimate the trend. Two different approaches could be used for this (with many variations of each).
- One approach is to estimate the trend with a smoothing procedure such as moving averages. (See Lesson 5.2 for more on that.) With this approach, an equation is not used to describe trend.
- The second approach is to model the trend with a regression equation.
- The second step is to “de-trend” the series. For an additive decomposition, this is done by subtracting the trend estimates from the series. For a multiplicative decomposition, this is done by dividing the series by the trend values.
- Next, seasonal factors are estimated using the de-trended series. For monthly data, this entails estimating an effect for each month of the year. For quarterly data, this entails estimating an effect for each quarter. The simplest method for estimating these effects is to average the de-trended values for a specific season. For instance, to get a seasonal effect for January, we average the de-trended values for all Januarys in the series, and so on. (Minitab uses medians rather than means, by the way.)
- The final step is to determine the random (irregular) component.
For the additive model, random = series – trend – seasonal.
For the multiplicative model, random = series / (trend*seasonal).
The random component could be analyzed for such things as the mean location, or mean squared size (variance), or possibly even for whether the component is actually random or might be modeled with an ARIMA model.
A few programs iterate through steps 1 to 3. For example, after step 3 we could use the seasonal factors to de-seasonalize the series and then return to step 1 to estimate the trend based on the de-seasonalized series. Minitab does this (and estimates the trend with a straight line in the iteration).
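The basic steps (without any iteration) can be sketched by hand in R for an additive model. This is a minimal illustration, not a replacement for a packaged routine. It assumes the built-in quarterly UKgas series as stand-in data and uses a centered moving average for the trend and means for the seasonal effects; all object names are illustrative.

```r
x <- UKgas                 # built-in quarterly series, used only as stand-in data
f <- frequency(x)          # 4 seasons per year

# Step 1: estimate the trend with a centered moving average (even span).
w <- c(0.5, rep(1, f - 1), 0.5) / f
trend <- stats::filter(x, filter = w, sides = 2)

# Step 2: de-trend (subtract for an additive model).
detrended <- x - trend

# Step 3: average the de-trended values season by season,
# then shift the effects so that they average to 0.
season_id <- as.integer(cycle(x))
seas_means <- tapply(as.numeric(detrended), season_id, mean, na.rm = TRUE)
figure <- as.numeric(seas_means - mean(seas_means))
seasonal <- ts(figure[season_id], start = start(x), frequency = f)

# Step 4: whatever is left is the random (irregular) component.
random <- x - trend - seasonal

# For comparison: the same quantities from R's decompose().
d <- decompose(x, type = "additive")
all.equal(figure, as.numeric(d$figure))
```

For a multiplicative model the subtractions become divisions and the effects are rescaled to average to 1 instead of 0.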
Decomposition in R
The basic command is decompose.
For an additive decomposition, use decompose(name of series, type = "additive").
For a multiplicative decomposition, use decompose(name of series, type = "multiplicative").
Important first step: As a preliminary you have to use a ts command to define the seasonal span for a series.
For quarterly data, it might be name of series = ts(name of series, freq = 4).
For monthly data, it might be name of series = ts(name of series, freq = 12).
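As a small illustration of this step, the sketch below builds a quarterly ts object from the first eight beer production values listed later in this lesson. The name beer8 is made up for the example. Note that freq is accepted because R partially matches it to the full argument name, frequency.

```r
# First eight quarterly beer production values (years 1 and 2 of the series).
beer8 <- c(284.4, 212.8, 226.9, 308.4, 262.0, 227.9, 236.1, 320.4)

# Declare the seasonal span: 4 observations per year.
beer8 <- ts(beer8, frequency = 4)

frequency(beer8)   # seasonal span stored with the series
cycle(beer8)       # quarter (1 to 4) assigned to each observation
```

Without this step the series has frequency 1, and decompose stops with an error because it cannot find a seasonal period.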
You can plot the elements of the decomposition by putting the decompose command as an argument of a plot command.
As an example,
plot(decompose(earnings, type = "multiplicative"))
Another way to plot is to store the results of the decomposition into a named object and then plot the object. As an example,
decompearn = decompose(earnings, type = "multiplicative")
plot(decompearn)
To see all elements of a stored object, simply type its name. For instance, entering decompearn will show all elements of the decomposition in the example above.
When the decomposition is stored in an object, you also have access to the various elements of the decomposition. For instance, in the example just given, decompearn\$figure contains the seasonal effects values for four quarters.
You could “print” the seasonal figures simply by entering decompearn\$figure. You could plot them using plot(decompearn\$figure).
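The sketch below shows what the stored object looks like. It assumes that R's built-in JohnsonJohnson series can stand in for the earnings data used above; with a different data file the numbers will differ.

```r
# Assumption: the built-in JohnsonJohnson series plays the role of 'earnings'.
earnings <- JohnsonJohnson

decompearn <- decompose(earnings, type = "multiplicative")

names(decompearn)         # components stored in the object
decompearn$figure         # one seasonal factor per quarter
mean(decompearn$figure)   # multiplicative factors are scaled to average 1
```

The other components are reached the same way, for example decompearn$trend or decompearn$random.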
Example 5-1 Continued: Additive Decomposition for Beer Production
The following commands produced the graph and numerical output that follows for the Australian beer production series.
Download the data: beerprod.dat
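beerprod = scan("beerprod.dat")  # read the raw values from the data file
beerprod = ts(beerprod, freq = 4)  # declare quarterly data (4 seasons per year)
decompbeer = decompose(beerprod, type = "additive")  # additive decomposition
plot(decompbeer)  # observed, trend, seasonal and random panels
decompbeer  # print all components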
The plot shows the observed series, the smoothed trend line, the seasonal pattern and the random part of the series.
The seasonal pattern is a regularly repeating pattern.
Here’s the numerical output:
Seasonal component (\$seasonal):

Year | Qtr1 | Qtr2 | Qtr3 | Qtr4 |
---|---|---|---|---|
1 | 7.896324 | -40.678676 | -24.650735 | 57.433088 |
2 | 7.896324 | -40.678676 | -24.650735 | 57.433088 |
3 | 7.896324 | -40.678676 | -24.650735 | 57.433088 |
4 | 7.896324 | -40.678676 | -24.650735 | 57.433088 |
5 | 7.896324 | -40.678676 | -24.650735 | 57.433088 |
… same rows down to 18 … |

Trend component (\$trend):

Year | Qtr1 | Qtr2 | Qtr3 | Qtr4 |
---|---|---|---|---|
1 | NA | NA | 255.3250 | 254.4125 |
2 | 257.4500 | 260.1000 | 262.8375 | 264.6875 |
3 | 265.4125 | 264.6500 | 262.4625 | 260.4000 |
4 | 261.2625 | 262.9875 | 266.1875 | 269.2375 |
5 | 270.5125 | 271.4625 | 272.1750 | 274.0125 |
6 | 274.3750 | 277.4500 | 278.9750 | 279.1750 |
7 | 282.9000 | 285.2875 | 287.9375 | 290.3875 |
… and so on to the 18th row … |

Random component (\$random):

Year | Qtr1 | Qtr2 | Qtr3 | Qtr4 |
---|---|---|---|---|
1 | NA | NA | -3.77426471 | -3.44558824 |
2 | -3.34632353 | 8.47867647 | -2.08676471 | -1.72058824 |
3 | -1.40882353 | 8.82867647 | -0.81176471 | -4.43308824 |
4 | 7.75882353 | 4.49117647 | 8.36323529 | -12.37058824 |
5 | 7.69117647 | -4.28382353 | 12.87573529 | -20.04558824 |
6 | 12.42867647 | -4.17132353 | 2.87573529 | 2.59191176 |
7 | -11.69632353 | 5.19117647 | 6.51323529 | -2.12058824 |
… and so on to the 18th row … |

Seasonal effects (\$figure):

[1] 7.896324 -40.678676 -24.650735 57.433088
Using the Seasonal Values
The elements of \$figure are the effects for the four quarters.
The seasonal effect values are repeated each year (row) in the \$seasonal output shown above.
The seasonal values are used to seasonally adjust future values. Suppose, for example, that the next quarter 4 value past the end of the series is 535. The quarter 4 seasonal effect is 57.433088, or about 57.43. Thus for this future value, the “de-seasonalized” or seasonally adjusted value = 535 − 57.43 = 477.57.
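The same arithmetic can be written in R. This is only a sketch: the effects are typed in from the \$figure output above, the first-year observations come from the data listed later in this lesson, and the object names are illustrative.

```r
# Additive seasonal effects for quarters 1 to 4, copied from the $figure output.
figure <- c(7.896324, -40.678676, -24.650735, 57.433088)

# Seasonally adjust a future quarter 4 value: subtract the quarter 4 effect.
future_q4 <- 535
adjusted_q4 <- future_q4 - figure[4]
adjusted_q4   # 477.5669

# Adjusting a whole year works the same way, quarter by quarter.
year1 <- c(284.4, 212.8, 226.9, 308.4)   # observed values for year 1
year1 - figure
```

With the stored decomposition available, the whole series could be adjusted at once by subtracting the seasonal component from the observed series.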
How the Trend Values Were Calculated
The trend values were determined as “centered” moving averages of span 4 (because there are four quarters per year). Here’s how the centered moving average for time = 3 would be calculated.
Average the observed data values at times 1 to 4:
\(\dfrac{1}{4}(x_1+x_2+x_3+x_4)\)
Average the values at times 2 to 5:
\(\dfrac{1}{4}(x_2+x_3+x_4+x_5)\)
Then average those two averages:
\begin{multline} \dfrac{1}{2}\left(\dfrac{1}{4}(x_1+x_2+x_3+x_4)+\dfrac{1}{4}(x_2+x_3+x_4+x_5)\right) \\ \shoveleft{= \dfrac{1}{8}x_1+\dfrac{1}{4}x_2 + \dfrac{1}{4}x_3 +\dfrac{1}{4}x_4 + \dfrac{1}{8}x_5} \end{multline}
More generally, the centered moving average smoother for time t (with 4 quarters) is
\(\dfrac{1}{8}x_{t-2}+\dfrac{1}{4}x_{t-1} + \dfrac{1}{4}x_t +\dfrac{1}{4}x_{t+1} + \dfrac{1}{8}x_{t+2}\)
Following are the first 8 values in the observed series. The smoothed trend value for time 3 in the series (Qtr 3 of year 1) is 255.325 and the smoothed trend value for time 4 is 254.4125. Use the data below to verify these values (and your understanding of the procedure).
Year | Qtr1 | Qtr2 | Qtr3 | Qtr4 |
---|---|---|---|---|
1 | 284.4 | 212.8 | 226.9 | 308.4 |
2 | 262.0 | 227.9 | 236.1 | 320.4 |
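One way to carry out this check is with the filter function in R's stats package, applying the weights 1/8, 1/4, 1/4, 1/4, 1/8 to the eight values in the table. The object names below are illustrative.

```r
# First eight observed values (years 1 and 2).
x <- ts(c(284.4, 212.8, 226.9, 308.4, 262.0, 227.9, 236.1, 320.4),
        frequency = 4)

# Centered moving average weights for quarterly data.
w <- c(1/8, 1/4, 1/4, 1/4, 1/8)

# sides = 2 centers the weights on time t; the first two and last two
# positions are NA because the window runs past the available data.
trend <- stats::filter(x, filter = w, sides = 2)
round(as.numeric(trend), 4)
# NA NA 255.3250 254.4125 257.4500 260.1000 NA NA
```

The values at times 3 to 6 match the first four non-missing trend values reported in the numerical output.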
For monthly data the centered moving average smoother for time t will be
\(\dfrac{1}{24}x_{t-6}+\left(\sum_{j=-5}^{5}\dfrac{1}{12}x_{t+j}\right)+\dfrac{1}{24}x_{t+6}\)
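The monthly weights can be built and checked in the same way. The sketch below assumes the built-in monthly nottem series as stand-in data and compares the hand-built smoother with the trend returned by decompose.

```r
f <- 12                                          # months per year
w <- c(1 / (2 * f), rep(1 / f, f - 1), 1 / (2 * f))

length(w)   # 13 weights: x[t-6] through x[t+6]
sum(w)      # the weights sum to 1

# Apply the smoother to a monthly series and compare with decompose().
x <- nottem                                      # built-in monthly series
trend_by_hand <- stats::filter(x, filter = w, sides = 2)
all.equal(as.numeric(trend_by_hand), as.numeric(decompose(x)$trend))
```

The first six and last six trend values are NA, since the 13-term window needs six observations on each side of time t.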
Example 5-1 Continued: Multiplicative Decomposition for Beer Production
The following two commands will do a multiplicative decomposition of the beer production series and print the seasonal effects.
decombeermult = decompose (beerprod, type = "multiplicative")
decombeermult$figure
The seasonal (quarterly) effects are:
[1] 1.0237877 0.8753662 0.9233315 1.1775147
To seasonally adjust a value, divide the observed value of the series by the seasonal factor for its quarter. For example, if a future quarter 4 value is 535, the seasonally adjusted value = 535/1.1775147 = 454.34677.
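As a quick check of this arithmetic in R (a sketch with the factors typed in from the output above; the object names are illustrative):

```r
# Multiplicative seasonal factors for quarters 1 to 4.
mult_figure <- c(1.0237877, 0.8753662, 0.9233315, 1.1775147)

mean(mult_figure)            # factors average to 1, up to rounding

# Seasonally adjust a future quarter 4 value: divide by the quarter 4 factor.
adjusted_q4 <- 535 / mult_figure[4]
adjusted_q4                  # about 454.35
```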
Lowess Seasonal and Trend Decomposition
A lowess smoother essentially replaces values with a “locally weighted” robust regression estimate of the value. The R command stl does an additive decomposition in which a lowess smoother is used to estimate the trend and (potentially) the seasonal effects as well. There are several parameters that can be adjusted, but the default does a fairly good job.
For our beer production example, the following command works:
stl(beerprod, "periodic")
The “periodic” parameter essentially causes the seasonal effects to be estimated in the usual way, as averages of de-trended values. The alternative is to set s.window to an odd number of lags, in which case lowess smoothing is used to estimate the seasonal effects based on a number of years equal to the specified s.window value. When you do this, the seasonal effects will change as you move through the series.
Here’s a piece of the result of the stl(beerprod, "periodic") command.
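The difference between the two settings can be seen in a short sketch. It assumes the built-in quarterly UKgas series as stand-in data, and the window value 7 is an arbitrary odd number chosen for illustration.

```r
x <- UKgas   # built-in quarterly series, used only as stand-in data

fit_periodic <- stl(x, s.window = "periodic")   # fixed seasonal effects
fit_moving   <- stl(x, s.window = 7)            # seasonal effects may evolve

seas_p <- as.numeric(fit_periodic$time.series[, "seasonal"])
seas_m <- as.numeric(fit_moving$time.series[, "seasonal"])

# Periodic: the same four effects repeat every year.
seas_p[1:8]

# Numeric s.window: the effects in the first and last years can differ.
seas_m[1:4]
seas_m[(length(x) - 3):length(x)]

# In both cases seasonal + trend + remainder reproduces the series.
all.equal(as.numeric(rowSums(fit_periodic$time.series)), as.numeric(x))
```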
The word “remainder” is used rather than “random.” (Maybe it’s not really random!)
Quarter | seasonal | trend | remainder | |
---|---|---|---|---|
1 Q1 | 8.06289 | 267.3569 | 8.9802489 | |
1 Q2 | -41.58529 | 261.8710 | -7.4856917 | |
1 Q3 | -24.68456 | 257.1444 | -5.5598227 | |
1 Q4 | 58.20698 | 253.8595 | -3.6664533 | |
2 Q1 | 8.06289 | 257.4133 | -3.4761521 | |
2 Q2 | -41.58529 | 260.6083 | 8.8769848 | |
2 Q3 | -24.68456 | 262.9967 | -2.2121517 | |
2 Q4 | 58.20698 | 264.2030 | -2.0099463 | |
3 Q1 | 8.06289 | 265.5992 | -1.7621096 | |
3 Q2 | -41.58529 | 265.2844 | 9.1008543 | |
3 Q3 | -24.68456 | 262.6567 | -0.9721191 | |
3 Q4 | 58.20698 | 259.6218 | -4.4287360 | |
...and so on for years 4 to 16... | ||||
17 Q1 | 8.06289 | 417.3459 | -6.2088201 | |
17 Q2 | -41.58529 | 420.2610 | -1.9757578 | |
17 Q3 | -24.68456 | 428.1381 | -10.6535322 | |
17 Q4 | 58.20698 | 435.8692 | 12.0238431 | |
18 Q1 | 8.06289 | 441.0740 | 9.2631448 | |
18 Q2 | -41.58529 | 447.1144 | -18.1291496 | |
18 Q3 | -24.68456 | 453.0243 | -1.4397012 | |
18 Q4 | 58.20698 | 459.3832 | 7.4098529 |
The additive seasonal effects are 8.06289, -41.58529, -24.68456, 58.20698.
These aren’t much different from what we got from decompose with type = "additive". Those seasonal values were
\$figure
[1] 7.896324 -40.678676 -24.650735 57.433088
The command plot(stl(beerprod, "periodic")) gave the following plot.