Seasonal Modelling of Fourier Series With Linear Trend
6; November 2016
ISSN 1927-7032 E-ISSN 1927-7040
Published by Canadian Center of Science and Education
Received: July 28, 2016 Accepted: September 6, 2016 Online Published: October 26, 2016
doi:10.5539/ijsp.v5n6p65 URL: http://dx.doi.org/10.5539/ijsp.v5n6p65
Abstract
This work was motivated by the need to model a periodic time series with a linear trend. A Fourier series
representation of the linearly detrended series was proposed. In this representation, the time series 𝑋𝑡 is expressed as a
combination of the linear trend component and a linear combination of 𝑠 orthogonal trigonometric functions, where 𝑠 is
the number of seasons. The method was applied to rainfall data and the proposed model was found to give a good fit.
A comparative study was carried out with the complete Fourier representation. Diagnostic checks revealed that the
proposed method performs better than the pure Fourier approach.
Keywords: Fourier series, seasonal model, linear trend, periodogram, spectral density, white noise process.
1. Introduction
The usual autoregressive integrated moving average (ARIMA) models developed by Box and Jenkins (1970) have been
extensively used in modelling linear time series. The ARIMA models assume that the current observation depends on
weighted previous observations, weighted previous random shocks and the current shock. However, most time series
arising in nature are not purely linear but rather periodic or seasonal with a linear trend. Seasonal time series contain a
seasonal phenomenon that repeats itself after a regular period of time. Such phenomena stem from factors such as weather,
which affects many business and economic activities, cultural events and graduation ceremonies. Series with a seasonal
pattern cannot be adequately represented by ARIMA models. To analyze such series, Wold (1974) arranged the series in a
two-dimensional table according to the season, and the totals and averages were computed. In the Wold (1974) representation,
a time series is thought to consist of trend-cycle, seasonal and irregular components. To estimate these components,
several decompositions are usually involved. Box, Jenkins and Reinsel (2008) extended the Box and Jenkins
(1970) ARIMA models to include a seasonal part, giving the seasonal autoregressive integrated moving average
(SARIMA) models. Despite these efforts, the models do not adequately represent most periodic series.
A better procedure extensively used for modelling periodic time series is Fourier analysis. This method represents the
time series by a set of elementary functions called a basis, such that all functions under study can be written as linear
combinations of the elementary functions in the basis. These elementary functions are the sine and cosine functions
or complex exponentials. The Fourier series approach describes the fluctuation of a time series in terms of sinusoidal
behaviour at various frequencies. Despite its wide acceptance, the Fourier approach still suffers
some setbacks. The major problems are the cumbersomeness of the Fourier representation and the non-inclusion
of a trend component. As will be seen in the methodology, representing the time series is very inconvenient if
we are to include all the terms required in the Fourier series. This does not suit a series of large sample size, because
writing out all the terms will consume several pages and can be tedious for both the researcher and the reader. Hence, there
is a need to reduce the number of terms in the Fourier expression and give a summarized representation that adequately
describes the time series. This is the intent of this work. As stated earlier, seasonal variations in time series can be caused
by climatic factors, and we use rainfall data in our illustration.
2. Literature Review
Accurate rainfall prediction is important because of the information it provides for river control, reservoir
operation, forestry interests, flood mitigation, etc. Due to the numerous benefits of rainfall
modelling and prediction, studies on rainfall analysis have been at the forefront of research.
Afshar, Joshua, Buckman, and Samuel (2014) modelled rainfall data using an ARIMA model and artificial neural networks
(ANNs). The ARIMA model was found to give discouraging results. However, when an artificial neural network was applied to the
ARIMA component, the combined model was found to be adequate and forecasts were generated.
http://ijsp.ccsenet.org International Journal of Statistics and Probability Vol. 5, No. 6; 2016
Cenis (1989) studied temperature in solarized soil using Fourier analysis. He obtained the daily maximum and minimum
temperatures at two depths for three summer periods. He used the values to fit sinusoidal equations which
accounted for 93% of the variation. The variation and the hourly mean differences between the measured temperatures were
calculated. The analysis gave an overall encouraging result.
Serangelo, Ferrari and De Luca (2011) applied a non-homogeneous Poisson process to examine the seasonal effects of daily
rainfall. The modelling process involved partitioning the observed daily rainfall data into calibration periods for the
estimation of parameters. Although the validation period for checking the occurrence process changed, the model, which
was applied to a set of rain gauges placed in different geographical areas, was shown to provide a good fit.
Falahah and Suorapto (2010) carried out research on rainfall data using an analytic factor method. The data were obtained
from 50 weather stations for a period of 30 years. The result was plotted on pattern factors to reveal the dominant factor for
each region and inspection period. The method explained the factors that influence rainfall in Indonesia and the reasons for
relatively higher humidity in one area than in another.
Necholas, Mahmood and Hazan (2013) modelled rainfall amounts for agricultural planning using gamma distribution
models. Daily rainfall data of two stations having two different mean annual rainfalls were analyzed. Generalized linear
models were used to fit smooth regression curves. The mean amount of rain per rainy day was computed using the
estimates of the parameters of the model for each day of the season. The adequacy of the fitted model was checked by the
analysis of deviance residuals and was found to be satisfactory. The Fourier approach was employed for a comparative study. It
was discovered that, though reasonable results were obtained, Fourier analysis was time-consuming and tedious. However,
the Fourier series was found suitable in fitting the gamma distribution for the determination of mean rain per rainy day.
Zakaria (2013) conducted a study on periodic and stochastic modelling of monthly rainfall and the periodicities were
determined. Stochastic components were estimated using the auto-regressive model approach. Residuals obtained from
the model were shown to follow a white noise process; thus indicating the adequacy of the fitted model.
Beatrice, Nasser, Afshar, Selaman and Fahmi (2014) analyzed data from eight rain gauge stations. Annual rainfall data for
27 years were computed with the Fourier series equation. The result was compared with that obtained from harmonic
series models. It was discovered that both models were capable of describing the rainfall pattern and were able to provide
a reasonable relationship between the simulated and the observed data.
Akpanta, Okorie and Okoye (2015) adopted SARIMA modelling in analyzing monthly rainfall
data in Umuahia. A probability time series approach was considered. The plot of the original data showed seasonality, which
was removed by differencing. After subjecting the model to diagnostic checks, SARIMA(0,0,0)(0,1,1)₁₂ was found to fit
the data well and was used for prediction.
3. Methodology
In this method, a periodic time series is first examined for the presence of a linear trend. Visual inspection of the raw
data plot can reveal this pattern. Assuming a linear trend is detected, a linear regression model of the form
𝑌𝑡 = 𝛽0 + 𝛽1 𝑡 + 𝑒𝑡 (1)
is first fitted to the data;
where 𝑌𝑡 is the observed time series, 𝑡 is the time points (𝑡 = 1,2, … , 𝑛), 𝑛 is the number of observations, 𝛽0 and 𝛽1
are the regression parameters, and 𝑒𝑡 is the error component.
Fitting the above model (1) to the data 𝑌𝑡 , we can obtain the estimate of the error component
$\hat{e}_t = Y_t - \hat{\beta}_0 - \hat{\beta}_1 t$
which can be tested for randomness or white noise.
After obtaining the trend equation (i.e. $\hat{Y}_t = \hat{\beta}_0 + \hat{\beta}_1 t$), the main series 𝑌𝑡 is detrended by the expression
$y_t = Y_t - \hat{Y}_t = Y_t - \hat{\beta}_0 - \hat{\beta}_1 t$ (2)
The resulting series 𝑦𝑡 is then used to fit seasonal model using Fourier representation.
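The detrending step in equations (1) and (2) can be sketched as follows. This is only an illustration under stated assumptions: the function name `detrend_linear` and the synthetic series are hypothetical and are not the rainfall data or the Minitab procedure used in the paper; ordinary least squares via NumPy's `polyfit` is assumed as the fitting method.

```python
import numpy as np

def detrend_linear(Y):
    """Fit Y_t = b0 + b1*t by ordinary least squares (t = 1..n) and
    return (b0, b1, y), where y_t = Y_t - b0 - b1*t as in equation (2)."""
    Y = np.asarray(Y, dtype=float)
    n = Y.size
    t = np.arange(1, n + 1)
    # np.polyfit returns coefficients from the highest degree down: slope, then intercept.
    b1, b0 = np.polyfit(t, Y, deg=1)
    y = Y - (b0 + b1 * t)
    return b0, b1, y

# Synthetic illustration (NOT the paper's data): linear trend plus a period-12 cycle.
t = np.arange(1, 121)
Y = 100.0 + 0.5 * t + 50.0 * np.sin(2 * np.pi * t / 12)
b0, b1, y = detrend_linear(Y)
```

Because the model contains an intercept and a slope, the detrended series `y` sums to zero and is uncorrelated with `t`, so it can be passed directly to the Fourier step.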
3.1 Fourier Series Representation of the Time Series 𝒚𝒕
Given a time series of 𝑛 observations, the Fourier representation is the set of 𝑞 orthogonal trigonometric functions
shown below:
$y_t = \sum_{i=1}^{q} (\alpha_i \cos 2\pi f_i t + \beta_i \sin 2\pi f_i t) + e_t$ (3)
estimated by
$\hat{y}_t = \sum_{i=1}^{q} (a_i \cos 2\pi f_i t + b_i \sin 2\pi f_i t)$ (4)
where $q = n/2$, $a_i = \frac{2}{n} \sum_{t=1}^{n} y_t \cos 2\pi f_i t$, $b_i = \frac{2}{n} \sum_{t=1}^{n} y_t \sin 2\pi f_i t$,
$e_t \sim NIID(0, \sigma^2)$; the period is $p_i = n/i$ and $f_i = i/n$ is the $i$th harmonic of the fundamental frequency $1/n$.
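The coefficient formulas attached to equation (4) can be sketched in Python with NumPy. The function name `fourier_coefficients` and the synthetic series are assumptions made for illustration only. The sketch applies the 2/n formulas as stated for every harmonic; some texts treat the last harmonic (i = n/2, n even) separately, which is not done here.

```python
import numpy as np

def fourier_coefficients(y):
    """Return (a, b) with a_i = (2/n) * sum_t y_t cos(2*pi*f_i*t) and
    b_i = (2/n) * sum_t y_t sin(2*pi*f_i*t), f_i = i/n, i = 1..q, q = n // 2."""
    y = np.asarray(y, dtype=float)
    n = y.size
    q = n // 2
    t = np.arange(1, n + 1)
    i = np.arange(1, q + 1)
    angles = 2 * np.pi * np.outer(i, t) / n   # shape (q, n): row i holds 2*pi*f_i*t
    a = (2.0 / n) * (np.cos(angles) @ y)
    b = (2.0 / n) * (np.sin(angles) @ y)
    return a, b

# Synthetic check (hypothetical series): a single harmonic at i = 10, i.e. period 120/10 = 12.
n = 120
t = np.arange(1, n + 1)
y = 3.0 * np.cos(2 * np.pi * 10 * t / n) + 4.0 * np.sin(2 * np.pi * 10 * t / n)
a, b = fourier_coefficients(y)
# a[9] and b[9] (harmonic i = 10) recover the amplitudes 3 and 4; the rest are numerically zero.
```

The orthogonality of the sine and cosine terms at the harmonic frequencies is what lets each pair (a_i, b_i) be computed independently of the others.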
3.2 The Periodogram
The periodogram is defined as the function of intensities $I(f_i)$ at frequency $f_i = i/n$ and is given as
$I(f_i) = \frac{n}{2}(a_i^2 + b_i^2)$; $i = 1, 2, \ldots, q$.
The periodogram plot shows the intensities against the frequencies or periods. The intensity $I(f_i)$ is simply the sum of
squares associated with the pair of coefficients $(a_i, b_i)$ and hence with the frequency $f_i$ or period $p_i$. That is,
$\sum_{t=1}^{n} (y_t - \bar{y})^2 = \sum_{i=1}^{q} I(f_i)$.
In the context at hand, the periodogram is used to determine the seasonality or periodicity of a time series. This is usually
indicated by the largest peak in the periodogram plot.
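As a rough sketch of how the periodogram identifies the seasonal period, the intensities can be computed at the harmonic frequencies and the largest peak located. The function name `periodogram` and the synthetic series below are assumptions for illustration; they are not the paper's rainfall data or its Minitab output.

```python
import numpy as np

def periodogram(y):
    """Return (freqs, intensity) with f_i = i/n and I(f_i) = (n/2)(a_i^2 + b_i^2), i = 1..n//2."""
    y = np.asarray(y, dtype=float)
    n = y.size
    t = np.arange(1, n + 1)
    i = np.arange(1, n // 2 + 1)
    angles = 2 * np.pi * np.outer(i, t) / n
    a = (2.0 / n) * (np.cos(angles) @ y)
    b = (2.0 / n) * (np.sin(angles) @ y)
    return i / n, (n / 2.0) * (a ** 2 + b ** 2)

# Synthetic monthly-style series (hypothetical): strong 12-period cycle, weaker 6-period cycle.
t = np.arange(1, 121)
y = 50.0 * np.sin(2 * np.pi * t / 12) + 5.0 * np.cos(2 * np.pi * t / 6)
freqs, intensity = periodogram(y)

peak = int(np.argmax(intensity))        # index of the largest intensity
dominant_period = 1.0 / freqs[peak]     # period p_i = n / i at the peak
total_ss = np.sum((y - y.mean()) ** 2)  # total sum of squares of the series
```

For this series the peak falls at the harmonic with period 12, and the intensities add up to the total sum of squares, which illustrates the decomposition stated above.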
3.3 The Spectrum
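The sample spectrum is obtained by allowing the frequency 𝑓 to vary continuously in the range 0 to 0.5 cycles so that the
periodogram can be re-defined as
$I(f) = \frac{n}{2}(a_f^2 + b_f^2)$; $0 \le f \le 0.5$,
where $a_f$ and $b_f$ are the coefficients in (4) computed with $f_i$ replaced by $f$.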
Fitting the full Fourier series in equation (3), where 𝑞 = 120/2 = 60, results in a residual variance of 11.23, and the Fourier
coefficients are displayed in Appendix C. The residual autocorrelation function is displayed in figure 5. Clearly, there is a
significant spike at lag 12 (𝜌𝑘 = −0.36). This shows that the residuals are correlated (at lag 12) and hence do not follow
a white noise process. The plots of the actual and estimated values displayed in figure 6 show a low correlation between these
values. Thus, the full Fourier series, despite its cumbersome nature, does not fit the data adequately.
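The white noise check on the residuals can be sketched as below. The helper names `sample_acf` and `ljung_box_q` are hypothetical, the simulated residuals are not the paper's, and the ±2/√n band is a common approximate 95% limit assumed here; Minitab's T statistics use lag-dependent standard errors, so the two criteria need not agree exactly.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the series x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    c0 = np.dot(d, d)
    return np.array([np.dot(d[: n - k], d[k:]) / c0 for k in range(1, max_lag + 1)])

def ljung_box_q(r, n):
    """Cumulative Ljung-Box statistics Q_m = n(n+2) * sum_{k<=m} r_k^2 / (n-k)."""
    k = np.arange(1, r.size + 1)
    return n * (n + 2) * np.cumsum(r ** 2 / (n - k))

n = 120
bound = 2.0 / np.sqrt(n)                  # approximate 95% limit for white noise
reported_r12 = -0.36                      # lag-12 value quoted in the text
significant = abs(reported_r12) > bound   # the quoted spike lies outside the band

# Simulated white noise (illustration only) run through the same diagnostics.
rng = np.random.default_rng(0)
e = rng.normal(size=n)
r = sample_acf(e, 30)
q_stats = ljung_box_q(r, n)
```

With n = 120 the band is roughly ±0.18, so a lag-12 autocorrelation of −0.36 is well outside it, in line with the conclusion that the residuals of the full Fourier fit are not white noise.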
4.2 The Proposed Approach
4.2.1 Seasonality and the Estimated Trend
The raw data plot in figure 1 clearly shows the existence of seasonality and trend. This is indicated by the periodic pattern
and upward movement of the graph. Fitting the trend equation gives the Minitab output in table 1 below.
[Figure: plot of Yt against Index (image not recovered)]
[Figure: Spectrum of yt, plotted against scaled frequency (0 to 60), with the corresponding periods (120.0, 10.9, 5.7, 3.9, 2.9, 2.4) on the upper axis; image not recovered]
[Figure: plot of Yt (image not recovered)]
[Figure: autocorrelation function for lags 1 to 30 (image not recovered); the values are tabulated below]
Lag Corr T LBQ Lag Corr T LBQ Lag Corr T LBQ Lag Corr T LBQ
1 -0.01 -0.08 0.01 10 -0.05 -0.47 8.44 19 0.03 0.29 20.11 28 -0.09 -0.87 27.79
2 0.14 1.52 2.41 11 0.11 1.11 9.99 20 -0.07 -0.65 20.79 29 0.02 0.15 27.83
3 0.09 0.96 3.41 12 -0.14 -1.39 12.51 21 -0.06 -0.54 21.27 30 0.01 0.09 27.85
4 0.13 1.36 5.46 13 0.13 1.30 14.81 22 -0.08 -0.73 22.13
5 0.05 0.54 5.80 14 0.01 0.11 14.83 23 0.01 0.07 22.14
6 -0.09 -0.92 6.79 15 0.13 1.30 17.25 24 -0.11 -1.06 24.04
7 -0.09 -0.93 7.83 16 -0.01 -0.08 17.26 25 -0.11 -1.07 26.02
8 0.03 0.28 7.93 17 -0.04 -0.40 17.50 26 -0.02 -0.18 26.08
9 -0.04 -0.43 8.16 18 0.13 1.28 19.97 27 0.05 0.43 26.41
[Figure: autocorrelation function for lags 1 to 30 (image not recovered); the values are tabulated below]
Lag Corr T LBQ Lag Corr T LBQ Lag Corr T LBQ Lag Corr T LBQ
1 -0.09 -1.00 1.02 10 0.00 0.01 9.71 19 0.02 0.21 36.28 28 -0.07 -0.60 40.84
2 0.12 1.30 2.79 11 0.11 1.16 11.46 20 -0.09 -0.79 37.45 29 -0.02 -0.20 40.92
3 0.04 0.48 3.04 12 -0.36 -3.67 29.42 21 -0.02 -0.18 37.52 30 -0.08 -0.67 41.89
4 0.06 0.67 3.53 13 0.17 1.51 33.18 22 -0.09 -0.78 38.70
5 0.05 0.56 3.89 14 0.03 0.25 33.29 23 0.06 0.52 39.25
6 -0.15 -1.62 6.88 15 0.03 0.29 33.44 24 0.02 0.15 39.29
7 -0.14 -1.50 9.56 16 -0.02 -0.15 33.48 25 -0.07 -0.60 40.02
8 -0.01 -0.15 9.59 17 -0.09 -0.82 34.66 26 -0.00 -0.00 40.02
9 -0.03 -0.31 9.71 18 0.10 0.92 36.20 27 0.02 0.17 40.08
[Figure: plot of Yt (image not recovered)]
Copyrights
Copyright for this article is retained by the author(s), with first publication rights granted to the journal.
This is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license
(http://creativecommons.org/licenses/by/4.0/).