ARIMA
Abstract:
This work delves into the Autoregressive Integrated Moving Average (ARIMA) model, a fundamental
tool in time series analysis and forecasting. The paper explores various subtopics within ARIMA
modeling, including its advancements, applications across different domains, and the challenges
encountered in its implementation.
Introduction
ARIMA (Autoregressive Integrated Moving Average) is a statistical analysis model that uses time
series data to either better understand the data set or to predict future trends.
An ARIMA model can be understood by outlining each of its components as follows:
Autoregression (AR): refers to a model that shows a changing variable that regresses on its own
lagged, or prior, values.
Integrated (I): represents the differencing of raw observations to allow the time series to become
stationary (i.e., data values are replaced by the difference between the data values and the previous
values).
Moving average (MA): incorporates the dependency between an observation and a residual error
from a moving average model applied to lagged observations. 1
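As a minimal sketch of the differencing behind the Integrated (I) component, the Python snippet below replaces each value with its difference from the previous value, repeated d times. The function name `difference` and the sample series are illustrative assumptions rather than anything taken from the cited sources.

```python
def difference(series, order=1):
    """Apply first differencing `order` times (the d in ARIMA(p, d, q))."""
    out = list(series)
    for _ in range(order):
        # Each value becomes: current value minus previous value.
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


# Hypothetical series with a quadratic trend (t^2 + 1 for t = 1..6).
trend = [2, 5, 10, 17, 26, 37]

once = difference(trend, order=1)   # still trending upward
twice = difference(trend, order=2)  # constant, so no trend remains

print(once)
print(twice)
```

In this toy case one difference still leaves a linear trend, while the second difference is constant, which suggests d = 2 would be a reasonable starting point for such a series.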
Time series forecasting uses observed data to predict future data points over a period known as the
lead time. Forecasts provide a basis for production control and production planning and help to
optimize industrial processes and economic planning. The major objective is to obtain the best
forecast, i.e., to ensure that the mean square of the deviation between the actual and the forecasted
values is as small as possible for each lead time 2.
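The forecasting objective described above can be illustrated with a short Python sketch that computes the mean squared deviation between actual and forecasted values. The function name and the numbers are hypothetical and serve only to show the calculation.

```python
def mean_squared_error(actual, forecast):
    """Mean of the squared deviations between actual and forecasted values."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical observations and forecasts made at one fixed lead time.
actual = [10, 12, 14, 16]
forecast = [11, 12, 12, 17]

# Squared errors are 1, 0, 4, 1, so the mean is 6 / 4.
mse = mean_squared_error(actual, forecast)
print(mse)
```

Computing this quantity separately for each lead time makes it possible to compare candidate models at the horizon that matters for the planning task.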
Over the past few decades, much effort has been devoted to the development and improvement of
time series forecasting models. Traditional models for time series forecasting, such as the
Box–Jenkins or Autoregressive Integrated Moving Average (ARIMA) model, assume that time series
data are generated by linear processes. However, these models may be inappropriate if the underlying
mechanism is nonlinear, and real-world systems are often nonlinear. The ARIMA model is a
stochastic process defined by three parameters, p, d, and q, where p is the order of the autoregressive
AR(p) part, d is the order of integration (the number of differences needed to transform the series
into a stationary stochastic process), and q is the order of the moving average MA(q) part.
Construction of an ARIMA model
1. Stationarize the series, if necessary, by differencing (and perhaps also logging, deflating, etc.).
2. Study the pattern of autocorrelations and partial autocorrelations to determine if lags of the
stationarized series and/or lags of the forecast errors should be included in the forecasting equation.
3. Fit the model that is suggested and check its residual diagnostics, particularly the residual ACF and
PACF plots, to see if all coefficients are significant and all of the pattern has been explained.
4. Patterns that remain in the ACF and PACF may suggest the need for additional AR or MA terms.
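Step 2 of the procedure above relies on the sample autocorrelation function (ACF). The following Python sketch computes it from first principles, assuming the usual estimator that divides the lag-k autocovariance by the lag-0 autocovariance. The function name `sample_acf` and the test series are illustrative assumptions. In practice a statistics library would normally be used, and the PACF and model fitting are not shown here.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a (stationarized) series."""
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Hypothetical alternating series: strong negative correlation at lag 1.
series = [1, -1, 1, -1, 1, -1, 1, -1]
acf = sample_acf(series, max_lag=2)
print(acf)
```

The same function can be applied to the residuals of a fitted model in step 3: if the residual autocorrelations at nonzero lags are all close to zero, little pattern remains to be explained.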
ARIMA terminology
• A non-seasonal ARIMA model can be (almost) completely summarized by three numbers:
p = the number of autoregressive terms
d = the number of nonseasonal differences
q = the number of moving-average terms
• This is called an “ARIMA(p,d,q)” model
• The model may also include a constant term (or not) 3.
$$y'_t = c + \phi_1 y'_{t-1} + \cdots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t,$$
where $y'_t$ is the differenced series (it may have been differenced more than once).
The “predictors” on the right-hand side include both lagged values of $y_t$ and lagged errors.
We call this an ARIMA(p, d, q) model, where
p = order of the autoregressive part;
d = degree of first differencing involved;
q = order of the moving average part. 4
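To make the forecasting equation above concrete, the sketch below evaluates its right-hand side for one step ahead, with the unknown future error set to its expected value of zero, and then undoes a single difference. It assumes an ARIMA(1,1,1) model whose coefficients and past residuals are already known. All numeric values, and the helper name `arma_one_step`, are hypothetical and chosen only for illustration; estimating the coefficients is not shown.

```python
def arma_one_step(diffed, residuals, c, phi, theta):
    """One-step-ahead forecast of the differenced series y'.

    phi[i-1] multiplies y'_{t-i}; theta[j-1] multiplies the error at t-j.
    The future error term is replaced by its expected value, zero.
    """
    if len(diffed) < len(phi) or len(residuals) < len(theta):
        raise ValueError("not enough history for the requested orders")
    ar_part = sum(coef * diffed[-i] for i, coef in enumerate(phi, start=1))
    ma_part = sum(coef * residuals[-j] for j, coef in enumerate(theta, start=1))
    return c + ar_part + ma_part


# Hypothetical observed series and its first difference (d = 1).
y = [10.0, 12.0, 15.0, 19.0]
diffed = [curr - prev for prev, curr in zip(y, y[1:])]  # [2.0, 3.0, 4.0]

# Assumed (not estimated) residuals and coefficients for illustration.
residuals = [0.0, 0.5, -0.2]
next_diff = arma_one_step(diffed, residuals, c=0.1, phi=[0.5], theta=[0.4])

# Undo the differencing: forecast of y is the last value plus forecast change.
next_y = y[-1] + next_diff
print(next_diff, next_y)
```

With these assumed values the forecast change is about 2.02 (0.1 + 0.5 × 4.0 + 0.4 × (−0.2)), so the forecast of the original series is about 21.02.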
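For example, a seasonal ARIMA(1,1,1)(1,1,1) model with seasonal period 4 (quarterly data) and no
constant can be written with the backshift operator B, where $B y_t = y_{t-1}$, as
$$(1 - \phi_1 B)(1 - \Phi_1 B^4)(1 - B)(1 - B^4) y_t = (1 + \theta_1 B)(1 + \Theta_1 B^4) \varepsilon_t.$$ 5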
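The two differencing factors in the seasonal equation above, (1 − B) and (1 − B^4), can be illustrated with a small Python sketch. It applies a seasonal difference at lag 4 followed by a first difference and checks that the result matches the expanded form y_t − y_{t−1} − y_{t−4} + y_{t−5}. The helper name `lag_difference` and the synthetic quarterly series are assumptions made for this example.

```python
def lag_difference(series, lag):
    """Apply (1 - B^lag): subtract the value observed `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical quarterly series: linear trend plus a repeating seasonal pattern.
seasonal_pattern = [5, -3, 1, 7]
y = [2 * t + seasonal_pattern[t % 4] for t in range(12)]

seasonal_diff = lag_difference(y, 4)     # (1 - B^4) y_t removes the seasonality
both = lag_difference(seasonal_diff, 1)  # (1 - B)(1 - B^4) y_t removes the trend

# Expanded operator: (1 - B)(1 - B^4) = 1 - B - B^4 + B^5.
expanded = [y[t] - y[t - 1] - y[t - 4] + y[t - 5] for t in range(5, len(y))]

print(seasonal_diff)
print(both)
```

For this synthetic series the seasonal difference is constant and the combined difference is zero everywhere, so both the seasonal pattern and the trend have been removed before the AR and MA terms are applied.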
Case Studies and Real-World Implementations
1. RESEARCH ON COVID-19 EPIDEMIC BASED ON ARIMA MODEL
To predict the trajectory of the pandemic, various models are employed, including infectious disease
transmission dynamics models such as the SEIR and SIR models, as well as statistical models like
time series analysis. While transmission dynamics models require understanding complex parameters,
time series models, notably the Autoregressive Integrated Moving Average (ARIMA) model, offer
simplicity and strong short-term predictability. ARIMA models utilize historical case data to forecast
the number of infections, aiding in the planning and implementation of effective control measures.
Overall, the study underscores the ongoing battle against COVID-19, the importance of predictive
modeling in understanding its spread, and the utility of ARIMA and similar models in informing
public health responses 6.
A second study explores the application of the Autoregressive Integrated Moving Average (ARIMA) model
in forecasting the total costs associated with a mining face drilling rig. The mining industry,
characterized by its dynamic operational environment and high equipment costs, presents a significant
challenge for cost forecasting. By leveraging historical data and ARIMA modeling techniques, this
study aims to provide accurate and reliable forecasts to support strategic decision-making and cost
management in the mining sector. 7
By Khurram Rashid