Essay: The History and Benefits of MA, AR, & ARIMA Models for Time Series Analysis


1.2 History of MA and AR models

Time series analysis first took off with the research on MA (moving average) and AR (autoregressive) models carried out in the 1920s and 1930s by G. U. Yule and G. T. Walker. Herman Wold later combined the two approaches to create the ARMA model, but it was not until 1970, with the publication of the book "Time Series Analysis" by G. E. P. Box and G. M. Jenkins, that ARMA models came into widespread use. An ARMA model is appropriate when the data have a structure with features of both MA and AR models, so that neither alone is sufficient and the two must be combined. It is because of the work of Box and Jenkins that ARMA and ARIMA models are now commonly known as Box-Jenkins models.

MA and AR models have unknown parameters that must be estimated to fit the model to the data. This is only possible once the order of the model is known; otherwise there are too many unknowns to solve for. Several estimation methods exist. For AR models the conditional least-squares method is commonly used, while MA models are usually fitted by maximum-likelihood estimation (MLE). There are two ways to compute the MLE: the conditional-likelihood method, in which the initial shocks are assumed to be zero, and the exact-likelihood method, in which the initial shocks are treated as additional parameters. When the sample size is large the two approaches give very similar results, and it makes little difference which one is used. If the model is close to being noninvertible, however, it is general practice to prefer the exact-likelihood estimates. For ARIMA models the conditional-likelihood method is generally used to estimate the parameters.
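To make the conditional least-squares idea concrete, the following is a minimal sketch for the AR(1) case: conditioning on the first observation, estimating x_t = c + φx_{t-1} + ε_t reduces to ordinary least squares of x_t on x_{t-1}. The function name and simulated series are illustrative, not part of any particular library.

```python
import random

def fit_ar1(x):
    """Conditional least-squares estimate of (c, phi) in
    x_t = c + phi * x_{t-1} + e_t, conditioning on x_0."""
    y = x[1:]           # responses x_t
    z = x[:-1]          # regressors x_{t-1}
    n = len(y)
    zbar = sum(z) / n
    ybar = sum(y) / n
    num = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    den = sum((zi - zbar) ** 2 for zi in z)
    phi = num / den                 # OLS slope
    c = ybar - phi * zbar           # OLS intercept
    return c, phi

# Illustrative check: simulate an AR(1) series with c = 1.0, phi = 0.6
random.seed(0)
x = [2.5]  # start near the process mean c / (1 - phi)
for _ in range(5000):
    x.append(1.0 + 0.6 * x[-1] + random.gauss(0, 1))

c_hat, phi_hat = fit_ar1(x)
print(c_hat, phi_hat)  # estimates should be close to 1.0 and 0.6
```

For higher-order AR(p) models the same idea extends to a multiple regression on the p lagged values; MA estimation, as noted above, instead requires an iterative likelihood-based procedure because the shocks are unobserved.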

1.3 Exponential smoothing and ARIMA models

The two classes of model most commonly used in time series analysis are exponential smoothing and ARIMA models. We will look first at exponential smoothing. Exponential smoothing models were first developed in the 1950s by Robert Brown, but have undergone several improvements and additions since then. The methods were first classified in 1969 and the classification has since been extended, most recently by Hyndman in 2002 and Taylor in 2003, so that today 15 different methods have been classified. For each of these 15 methods there are two possible state space models, one with additive errors and one with multiplicative errors. This classification means that models and parameters can now be chosen in an automated way.

The simplest of the 15 methods is simple exponential smoothing (the N,N method: no trend, no seasonality). The next value is forecast by taking the previous forecast and adding the forecast error multiplied by a constant between 0 and 1, where the error term is the actual value minus the forecast of the previous term. This recursion can be expanded by substituting each forecast with its own forecasting equation, which shows that the forecast of the next value is a weighted moving average of all past observations. The other 14 methods are still needed, however, because simple exponential smoothing is only really suitable for data with no trend, seasonality, or other underlying structure.
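The recursion described above can be sketched in a few lines of plain Python; the function name and the example series are illustrative, and the smoothing constant alpha is the 0-to-1 constant from the text.

```python
def ses_forecast(x, alpha, f0=None):
    """One-step-ahead simple exponential smoothing forecasts.

    Implements f[t+1] = f[t] + alpha * (x[t] - f[t]), 0 < alpha < 1,
    initialised at the first observation unless f0 is given.
    Returns the forecasts f[1], ..., f[n]; the last entry is the
    forecast of the next, not-yet-observed value.
    """
    f = x[0] if f0 is None else f0
    out = []
    for obs in x:
        f = f + alpha * (obs - f)   # update with the forecast error
        out.append(f)
    return out

series = [3.0, 5.0, 4.0, 6.0, 5.0]
print(ses_forecast(series, alpha=0.5))  # -> [3.0, 4.0, 4.0, 5.0, 5.0]
```

Unrolling the update gives f[t+1] = alpha*x[t] + alpha*(1-alpha)*x[t-1] + alpha*(1-alpha)^2*x[t-2] + ..., which is exactly the weighted moving average of past observations mentioned above, with geometrically decaying weights.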

It can be shown that the linear exponential smoothing models are simply special cases of ARIMA models. The non-linear exponential smoothing models, however, have no equivalent ARIMA models. For non-stationary time series data sets, an exponential smoothing model is more suitable than an ARIMA model; on the other hand, ARIMA models are more suitable for linear data sets than exponential smoothing. In this report both exponential smoothing and ARIMA models will be fitted to the data set, and the two will then be compared and analysed to determine which gives the better forecast for our data.

1.4 ARCH and GARCH models

Exponential smoothing works best when the data have a constant mean and no seasonality, while ARIMA models are useful for short-term forecasting of non-stationary data. For most time series data sets, exponential smoothing or ARIMA models capture enough of the structure in the data to be an effective model. If the variance of the series is itself volatile, however, a new type of model is needed to forecast it, and for this purpose the (Generalized) Autoregressive Conditional Heteroskedasticity, or (G)ARCH, models were introduced. Because of their use in modelling volatility, ARCH and GARCH models are used largely in financial time series analysis, and more specifically for stock data. The ARCH(m) model makes the variance at time t conditional on the observations at the previous m times; similarly, the GARCH model uses past squared observations and past variances to model the variance at time t.

The ARCH model was formulated in 1982 by Engle and the GARCH model by Bollerslev in 1986. ARCH models work on the idea that the shocks are serially uncorrelated but dependent, and that this dependence on lagged values can be described by a simple quadratic function. The model therefore implies volatility clustering: a large shock is expected to be followed by another large shock, and similarly a small shock by another small shock. ARIMA and exponential smoothing models can both remove some of the significant spikes at low-level lags in the ACF and PACF of a series. However, if the ACF and PACF of the squared residuals of an ARIMA or exponential smoothing model still show many significant lags, this is when ARCH and GARCH models need to be introduced, as they can remove these remaining significant lags.

For data generated by an ARCH model one would expect the ACF to show no correlation and very few significant lags, while the PACF of the same data would show some spikes, occurring at irregular positions on the lag axis. The order of an ARCH model is found from the PACF of the squared shocks: it is the largest significant lag. In practice the order of an ARCH model tends to be quite large, so GARCH models were introduced to simplify the model and keep the number of parameters to a minimum. A GARCH model can be considered as an application of ARMA ideas to the squared shock series. Its order is written GARCH(a,b), where a is the number of ARCH terms in the equation and b is the number of moving-average lags specified. It is generally agreed that GARCH(1,1) provides a good enough fit for most models; higher orders are only really required when there is a lot of data in the data set.
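The GARCH(1,1) recursion described above can be illustrated with a small simulation; this is a minimal sketch in plain Python, with illustrative parameter values and function names, not a production volatility model.

```python
import random

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate n shocks a_t from a GARCH(1,1) model:

        a_t      = sigma_t * z_t,        z_t ~ N(0, 1)
        sigma2_t = omega + alpha * a_{t-1}^2 + beta * sigma2_{t-1}

    Requires alpha + beta < 1, giving a finite unconditional
    variance omega / (1 - alpha - beta).
    """
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at unconditional variance
    shocks = []
    for _ in range(n):
        a = (sigma2 ** 0.5) * rng.gauss(0, 1)
        shocks.append(a)
        # variance for the next period depends on this shock squared
        # and on the current variance -- the source of volatility clustering
        sigma2 = omega + alpha * a * a + beta * sigma2
    return shocks

a = simulate_garch11(20000, omega=0.1, alpha=0.1, beta=0.8)
sample_var = sum(x * x for x in a) / len(a)
# unconditional variance here is 0.1 / (1 - 0.1 - 0.8) = 1.0,
# and the sample variance should be close to it
print(sample_var)
```

The shocks themselves are uncorrelated, but their squares are not: a large `a * a` raises `sigma2` and so makes the next shock large in magnitude on average, which is exactly the clustering behaviour, and the dependence the ACF/PACF of the squared series picks up.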

1.5 Forecasting

There are several measures that can be applied to the forecast errors to allow an effective comparison between models. Two commonly used measures for data on the same scale are the Mean Absolute Error (MAE) and the Root Mean Squared Error (RMSE); these can only be used to compare data measured in the same units, for example both in USD. There are occasions where we need to compare data with different units, and a good comparison tool for this is the Mean Absolute Percentage Error (MAPE), which has no scale because it works with percentage errors. The MAPE has limited effectiveness, however, if the data contain small values. A second scale-independent measure is the Mean Absolute Scaled Error (MASE), which is particularly effective when the training set is much larger than the test set.
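The four measures above can be written down directly; the following is a self-contained sketch with illustrative example series. The MASE here scales the forecast errors by the in-sample MAE of the one-step naive forecast on the training set, which is the usual choice for non-seasonal data.

```python
def mae(actual, forecast):
    """Mean Absolute Error (same scale as the data)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root Mean Squared Error (same scale as the data)."""
    n = len(actual)
    return (sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n) ** 0.5

def mape(actual, forecast):
    """Mean Absolute Percentage Error (scale-free).
    Undefined if any actual value is zero, and unstable near zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

def mase(actual, forecast, train):
    """Mean Absolute Scaled Error: errors scaled by the in-sample
    MAE of the one-step naive forecast on the training set."""
    scale = (sum(abs(train[t] - train[t - 1]) for t in range(1, len(train)))
             / (len(train) - 1))
    return mae(actual, forecast) / scale

train = [10.0, 12.0, 11.0, 13.0, 12.0]
actual = [14.0, 13.0]
forecast = [13.0, 14.0]
print(mae(actual, forecast))   # -> 1.0
print(rmse(actual, forecast))  # -> 1.0
```

A MASE below 1 means the forecasts beat the naive method on average; here the naive scale on the training set is 6/4 = 1.5, so the MASE is 1.0 / 1.5 ≈ 0.67.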
