Fuzzy Time Series Model Based On Weighted Association Rule For Financial Market Forecasting
DOI: 10.1111/exsy.12271
ORIGINAL ARTICLE
KEYWORDS
financial market forecasting, fuzzy time series, multi‐order fuzzy rule, weighted association rules
1 | INTRODUCTION
Forecasting is a frequent and important activity in daily life. Forecasting tools help people make decisions in areas such as weather prediction, financial prediction, stock prediction, and school enrolment planning. Traditional forecasting methods can handle sequential data but cannot deal with problems in which the historical data are expressed as linguistic values. Therefore, building on Zadeh's fuzzy set theory (Zadeh, 1965), Song and Chissom (1993a, 1993b, 1994) proposed the fuzzy time series forecasting model to handle such problems. They used two kinds of fuzzy time series models, the time‐invariant model and the time‐variant model, to forecast student enrolments at the University of Alabama (Song & Chissom, 1993b, 1994). Chen (1996) then proposed a new fuzzy time series model to forecast the enrolments of the University of Alabama.
In financial markets, predicting the gold price and exchange rates is very important for investment. Investment fund managers and financial analysts therefore predict prices using professional knowledge and investment analysis methods, such as technical analysis, fundamental analysis, and time series models (Su & Cheng, 2016). In recent years, this type of forecasting problem has been studied extensively in statistics, econometrics, and mathematical finance. Many existing methods help users predict prices and future index values, including the work of Faff, Brooks, and Kee (2002), Huarng and Yu (2005), Shin and Sohn (2004), Wang (2002), and Yu (2005). Fuzzy time series models have been applied to several forecasting problems, such as university enrolment forecasting (Chen, 1996, 2002; Chen & Chung, 2006; Chen & Hsu, 2004), temperature forecasting (Chen & Hwang, 2000), and stock index forecasting (Huarng & Yu, 2005; Cheng, Chang, & Yeh, 2006; Huarng, 2001a, 2001b; Huarng & Yu, 2003, 2004, 2006; Yolcu & Lam, 2017). Building on this line of work, this study proposes an approach based on Apriori association rules for financial market forecasting.
This study extends the methods proposed in an earlier conference paper (Liu et al., 2011) for solving forecasting problems.
Expert Systems. 2018;e12271. wileyonlinelibrary.com/journal/exsy Copyright © 2018 John Wiley & Sons, Ltd 1 of 15
2 of 15 CHENG AND CHEN
The main problem of past fuzzy time series models is that the length of intervals is determined subjectively. A model that determines an appropriate interval length is therefore necessary. In recent years, many fuzzy time series methods have been presented to solve forecasting problems (Chen, 2002; Cheng, Chen, Teoh, & Chiang, 2008; Cheng, Cheng, & Wang, 2008; Chen & Chung, 2006; Chen & Hsu, 2004; Chen & Hwang, 2000; Huarng, 2001a, 2001b; Huarng & Yu, 2005; Huarng & Yu, 2003, 2004, 2006; Hwang, Chen, & Lee, 1998; Lee, Wang, & Chen, 2007, 2008). Many of them achieved better forecasting performance than traditional methods, and those studies obtained an appropriate interval length for the linguistic values to improve forecasting performance. However, no study has reasonably shown how to determine the appropriate lower and upper bounds of the universe of discourse. If the maximum of the testing data is greater than the upper bound of the last linguistic interval, or the minimum of the testing data is less than the lower bound of the first linguistic interval, the data cannot be fuzzified correctly. Therefore, it is important to build more reasonable lower and upper bounds of the universe of discourse.
On the weighting problem, Chen (1996) proposed a fuzzy time series method in which all fuzzy relations have equal weights. In the real world, however, each fuzzy relation has a different effect on prediction: the more frequently a fuzzy relationship occurs, the higher its degree of impact. Yu (2005) then proposed weighted fuzzy time series models, which treat recent fuzzy logical relationships as more important than past ones. Many previous studies have proposed different fuzzy time series models to deal with uncertain and vague data. A drawback of these models is that they do not appropriately consider the weights of fuzzy relations. It is therefore necessary to assign appropriate weights to fuzzy relations.
For the reasons above, this study proposes a fuzzy time series model based on weighted Apriori association rules, which has three focuses:
1. Utilize an objective method to determine the length of intervals automatically, and propose a spread partition algorithm to decide the linguistic intervals.
The remainder of this paper is organized as follows. Section 2 reviews the related literature. Section 3 introduces the proposed model and its algorithm for one‐order and multi‐order weighted fuzzy rules. Section 4 evaluates the performance of the proposed model and compares it with other models. Section 5 presents the profits obtained on the data sets. Section 6 gives the conclusions and future work.
2 | RELATED WORK
This section briefly introduces the related literature, including fuzzy set theory, fuzzy time series, and association rules.
1. Max‐membership principle
This scheme is limited to peaked output functions. The method is given by the algebraic expression
$$\mu_C(z^*) \ge \mu_C(z) \quad \text{for all } z \in Z. \tag{1}$$
2. Centroid method
This method is the most popular defuzzification method; it is also called centre of area or centre of gravity. It is given by the algebraic expression
$$z^* = \frac{\int \mu_C(z)\, z\, dz}{\int \mu_C(z)\, dz}. \tag{2}$$
$$z^* = \frac{\sum \mu_C(z)\, z}{\sum \mu_C(z)}. \tag{3}$$
4. Mean‐max membership
This method, also called middle‐of‐maxima, is closely related to the max‐membership principle, except that the location of the maximum membership can be non‐unique. It is given by the algebraic expression
$$z^* = \frac{a + b}{2}. \tag{4}$$
where fA is the membership function of fuzzy set A, fA : U → [0, 1], ui is an element of fuzzy set A, fA(ui) indicates the degree of membership of ui in A, fA(ui) ∈ [0, 1], and 1 ≤ i ≤ n.
Let Y(t) (t = …, 0, 1, 2, …), a subset of the real numbers, be the universe of discourse on which the fuzzy sets fi(t) (i = 1, 2, …) are defined. Let F(t) be the collection of fi(t) (i = 1, 2, …); then F(t) is called a fuzzy time series on Y(t) (t = …, 0, 1, 2, …).
If there exists a fuzzy relationship R(t − 1, t) such that F(t) = F(t − 1) × R(t − 1, t), where × is an operator, then F(t) is said to be caused by F(t − 1). When F(t − 1) = Ai and F(t) = Aj, the fuzzy logical relationship (FLR) between F(t − 1) and F(t) is denoted as Ai → Aj, where Ai and Aj are the left‐hand side and right‐hand side of the FLR, respectively.
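The construction of FLRs from a fuzzified series can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `extract_flrs` and `group_flrs` and the label sequence are hypothetical, and the series is assumed to be already fuzzified into labels such as "A1".

```python
from collections import defaultdict


def extract_flrs(labels):
    """Pair each fuzzified value F(t-1) with its successor F(t): A_i -> A_j."""
    return list(zip(labels[:-1], labels[1:]))


def group_flrs(flrs):
    """Group FLRs by left-hand side, keeping each right-hand side once."""
    groups = defaultdict(list)
    for lhs, rhs in flrs:
        if rhs not in groups[lhs]:
            groups[lhs].append(rhs)
    return dict(groups)


# Hypothetical fuzzified series, used only for illustration.
labels = ["A1", "A1", "A2", "A2", "A3", "A2"]
flrs = extract_flrs(labels)
groups = group_flrs(flrs)
```

Grouping by left-hand side yields the rule sets used in Cases 1 and 2 of Chen's model: a left-hand side with one right-hand side corresponds to Case 1, and one with several corresponds to Case 2.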
A fuzzy time series model consists of establishing fuzzy relationships, forecasting, and defuzzifying. These computation steps can be briefly illustrated with Chen's model (Chen, 1996) as follows:
Case 1. There is only one FLR in the FLR sequence. If Ai → Aj, then the forecast value F(t) is equal to Aj.
Case 2. There is more than one FLR in the FLR sequence. If Ai → Ai, Aj, ⋯, Ak, then the forecast value F(t) is equal to Ai, Aj, …, Ak.
Step 6. Defuzzify.
Apply the centre‐of‐gravity method to obtain the forecasting results. This is the most commonly adopted defuzzification method.
3 | PROPOSED MODEL
Traditional time‐series models (i.e., statistical methods) can handle sequence data but cannot handle uncertain and vague information. To address this problem, Song and Chissom (1993b) proposed the fuzzy time series model, and many researchers have since proposed different fuzzy time series models (Chang, Lee, Liao, & Cheng, 2007; Chen, 1996; Cheng et al., 2006; Hwang et al., 1998; Lee et al., 2008; Yu, 2005). Based on these studies, this paper summarizes the factors that influence fuzzy time series models as follows.
1. Length of interval: The length of interval influences the forecasting accuracy. Moreover, previous fuzzy time series models defined the linguistic intervals subjectively, which leads to forecasting shortcomings: some linguistic intervals contain no data, whereas others contain many data (Cheng et al., 2006; Chen & Hwang, 2000; Huarng, 2001b; Yu, 2005).
2. Lag period: In past research, stock price forecasting is based on data from lagged periods, and the number of lag periods is an important factor.
3. Weight of fuzzy rule: Chen's model (Chen, 1996) uses the centre of gravity method to defuzzify with equal weights, which treats all fuzzy rules as having the same priority; however, each fuzzy rule has a different influence on the forecasting result. Past fuzzy time series models focused on the FLRs and ignored the correlation between recent prices (Cheng, Chen, et al., 2008). Investment decisions are usually related to recent stock prices; consequently, the forecast stock price is strongly associated with the previous stock price.
Based on the factors above, this study extends the methods proposed in the conference paper (Liu et al., 2011) to solve these problems:
1. Propose a reasonable algorithm to determine the appropriate lower bound and upper bound of the universe of discourse, so that the data are adequately spread over the linguistic intervals;
2. Consider multi‐order fuzzy relations, although Teoh, Chen, Cheng, and Chu (2009) found that the estimated time lag for their two empirical databases is one lagged period; and
3. Build fuzzy rules based on Apriori association rules (Apriori‐AR) and calculate the weights of fuzzy relations for forecasting financial market problems.
3.1 | Algorithm
To make the proposed model easy to compute, this section presents an algorithm with seven steps. Each step is described in detail as follows:
Step 1. Define the universe of discourse and determine the length of intervals automatically.
First, define the universe of discourse U. Let U = [Dlow, Dup], where Dlow and Dup denote the lower bound and the upper bound of U. Partition U into n equal intervals, each of length l, where
$$l = \left[ \frac{\dfrac{D_{max} - D_{min}}{n} + \dfrac{D_{max} - D_{min}}{n - 1}}{2} \right], \tag{5}$$
where Dmin and Dmax are the minimum and maximum values of the observations, and n is the number of intervals into which U is partitioned; the interval length l is then calculated by Equation 5. Following Miller's concept of the limits on human capacity for processing information, that is, the magical number seven, plus or minus two, this step partitions the universe of discourse into seven equal‐length intervals, each represented by a linguistic expression describing the observation conditions (Miller, 1956). After determining the length of intervals, use the proposed “A” (automatically determining lower bound and upper bound) to calculate the lower bound Dlow and the upper bound Dup of the universe of discourse U.
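The interval length of Equation 5 is the average of the spread divided by n and the spread divided by n − 1. A minimal sketch follows; the function name `interval_length` and the sample minimum and maximum are hypothetical, and the default n = 7 follows the seven intervals mentioned in the text.

```python
def interval_length(d_min, d_max, n=7):
    """Equation 5: mean of (Dmax - Dmin)/n and (Dmax - Dmin)/(n - 1)."""
    if n < 2:
        raise ValueError("n must be at least 2 so that n - 1 is nonzero")
    spread = d_max - d_min
    return (spread / n + spread / (n - 1)) / 2.0


# Hypothetical observation range, used only for illustration.
length = interval_length(16000.0, 19500.0, n=7)
```

Because the second term divides by n − 1, the resulting length is slightly larger than an exact n-way split of the observed range, which leaves some margin beyond Dmin and Dmax.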
$$\mu_{L_1}(x) = \begin{cases} 1, & x < a \\ 1, & a \le x < b \\ \dfrac{c - x}{c - b}, & b \le x < c \\ 0, & x > c \end{cases} \tag{6}$$
$$\mu_{L_i}(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x < b \\ \dfrac{c - x}{c - b}, & b \le x < c \\ 0, & x > c \end{cases} \tag{7}$$
$$\mu_{L_n}(x) = \begin{cases} 0, & x < a \\ \dfrac{x - a}{b - a}, & a \le x < b \\ 1, & b \le x < c \\ 1, & x > c \end{cases} \tag{8}$$
9. else
10. $D_{up} = \dfrac{D_{max}}{l} + l$
where μLi(x) denotes the membership value of crisp datum x in fuzzy set Li, and a, b, and c denote the lower bound, midpoint, and upper bound of the interval of Li, respectively. If a datum has nonzero membership in two fuzzy sets, the linguistic value with the higher membership value is chosen.
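The fuzzification rule above can be sketched as follows. This is an illustrative reading of Equations 6 to 8, not the authors' code: the function names are hypothetical, the (a, b, c) triples in the example are made-up overlapping values, and the boundary case x = c, which the equations leave open, is assumed here to fall into the last branch.

```python
def mu_first(x, a, b, c):
    """Equation 6: left-shoulder membership for the first set L1."""
    if x < b:
        return 1.0
    if x < c:
        return (c - x) / (c - b)
    return 0.0  # assumption: x == c is treated like x > c


def mu_middle(x, a, b, c):
    """Equation 7: triangular membership for an inner set Li."""
    if x < a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def mu_last(x, a, b, c):
    """Equation 8: right-shoulder membership for the last set Ln."""
    if x < a:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return 1.0


def fuzzify(x, triples):
    """Return (1-based index, membership) of the set with the highest membership."""
    n = len(triples)
    best_index, best_mu = 1, -1.0
    for i, (a, b, c) in enumerate(triples, start=1):
        if i == 1:
            mu = mu_first(x, a, b, c)
        elif i == n:
            mu = mu_last(x, a, b, c)
        else:
            mu = mu_middle(x, a, b, c)
        if mu > best_mu:
            best_index, best_mu = i, mu
    return best_index, best_mu


# Hypothetical overlapping (a, b, c) triples for L1, L2, L3.
triples = [(0.0, 5.0, 10.0), (5.0, 10.0, 15.0), (10.0, 15.0, 20.0)]
label, degree = fuzzify(9.0, triples)
```

For x = 9 the datum belongs to L1 with degree 0.2 and to L2 with degree 0.8, so L2 is selected. Values below the first interval or above the last one still receive a label because of the two shoulder functions.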
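$$\begin{aligned}
\left(L_i^{o_1}, \mu(L_i^{o_1})\right) &\rightarrow \left(L_j^{o_1+1}, \mu(L_j^{o_1+1})\right) \\
\left(L_i^{o_2}, \mu(L_i^{o_2})\right) &\rightarrow \left(L_j^{o_2+1}, \mu(L_j^{o_2+1})\right) \\
&\;\;\vdots \\
\left(L_i^{o_v}, \mu(L_i^{o_v})\right) &\rightarrow \left(L_j^{o_v+1}, \mu(L_j^{o_v+1})\right) \\
&\;\;\vdots \\
\left(L_m^{o_n}, \mu(L_m^{o_n})\right) &\rightarrow \left(L_j^{o_n+1}, \mu(L_j^{o_n+1})\right)
\end{aligned} \tag{9}$$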
where 1 ≤ i ≤ m, 1 ≤ j ≤ m, m is the total number of linguistic values, 1 ≤ v ≤ n, and n is the total number of occurrences of the same relation. $o_v$ denotes the v‐th occurrence of the same relation. $L_i^{o_1}$ is the antecedent of the rule Li → Lj at its first occurrence, and $L_j^{o_1+1}$ is the linguistic value that follows $L_i^{o_1}$. In this study, Li and Lj represent the antecedent and consequent linguistic values of the relation rules, respectively.
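The k‐order fuzzy relations (k = 2, 3, …) are established from the (k − 1)‐order fuzzy rules. The k‐order fuzzy relations can be defined as
$$\begin{aligned}
\left(L_i^{o_1-k+1}, \mu(L_i^{o_1-k+1})\right), \ldots, \left(L_i^{o_1-1}, \mu(L_i^{o_1-1})\right), \left(L_i^{o_1}, \mu(L_i^{o_1})\right) &\rightarrow \left(L_j^{o_1+1}, \mu(L_j^{o_1+1})\right) \\
\left(L_i^{o_2-k+1}, \mu(L_i^{o_2-k+1})\right), \ldots, \left(L_i^{o_2-1}, \mu(L_i^{o_2-1})\right), \left(L_i^{o_2}, \mu(L_i^{o_2})\right) &\rightarrow \left(L_j^{o_2+1}, \mu(L_j^{o_2+1})\right) \\
&\;\;\vdots \\
\left(L_i^{o_v-k+1}, \mu(L_i^{o_v-k+1})\right), \ldots, \left(L_i^{o_v-1}, \mu(L_i^{o_v-1})\right), \left(L_i^{o_v}, \mu(L_i^{o_v})\right) &\rightarrow \left(L_j^{o_v+1}, \mu(L_j^{o_v+1})\right) \\
&\;\;\vdots \\
\left(L_i^{o_n-k+1}, \mu(L_i^{o_n-k+1})\right), \ldots, \left(L_i^{o_n-1}, \mu(L_i^{o_n-1})\right), \left(L_m^{o_n}, \mu(L_m^{o_n})\right) &\rightarrow \left(L_j^{o_n+1}, \mu(L_j^{o_n+1})\right).
\end{aligned} \tag{10}$$
All variables are defined in the same way as for the one‐order fuzzy relations.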
$$W\left(L_i^{o_s-k+1}, \ldots, L_i^{o_s-1}, L_i^{o_s}, L_j^{o_s+1}\right) = \sum_{p=1}^{n} \min\left(\mu(L_i^{o_p-k+1}), \ldots, \mu(L_i^{o_p-1}), \mu(L_i^{o_p}), \mu(L_j^{o_p+1})\right) \tag{11}$$
where n denotes the total number of occurrences of the same fuzzy relation, k denotes the order of the fuzzy relation, and s denotes the s‐th relation. $L_i^{o_s-k+1}$ is the linguistic value of the $|-k+1|$‐th antecedent part of the s‐th relation. Overall, the weight of a fuzzy relation is obtained by computing the cardinality of the k‐order fuzzy relation, and W is not limited to [0, 1].
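The weight of Equation 11 sums, over every occurrence of a relation, the minimum membership among its antecedents and its consequent. A minimal sketch under stated assumptions: the function name `relation_weights`, the (label, membership) input layout, and the sample series are all hypothetical.

```python
from collections import defaultdict


def relation_weights(fuzzified, k=1):
    """Equation 11 for k-order relations.

    fuzzified is a list of (label, membership) pairs in time order.
    Returns {(antecedent_labels, consequent_label): weight}.
    """
    weights = defaultdict(float)
    for t in range(k - 1, len(fuzzified) - 1):
        window = fuzzified[t - k + 1 : t + 2]  # k antecedents plus the consequent
        labels = tuple(label for label, _ in window)
        degree = min(mu for _, mu in window)   # min over all memberships involved
        weights[(labels[:-1], labels[-1])] += degree
    return dict(weights)


# Hypothetical fuzzified series, used only for illustration.
series = [("L1", 0.9), ("L2", 0.6), ("L2", 0.8), ("L1", 0.7), ("L2", 0.5)]
w1 = relation_weights(series, k=1)
w2 = relation_weights(series, k=2)
```

In this example L1 → L2 occurs twice with minima 0.6 and 0.5, so its weight is 1.1, which shows why W can exceed 1.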
After computing the confidence of each fuzzy relation, select the one‐order and k‐order fuzzy rules. The confidence of a fuzzy relation is calculated by Equation 12.
$$C_{L_i \rightarrow L_j} = \frac{W(L_i \rightarrow L_j)}{\sum_{k=1}^{n(L)} W(L_i \rightarrow L_k)}, \tag{12}$$
where $C_{L_i \rightarrow L_j}$ denotes the confidence of fuzzy rule Li → Lj, and n(L) denotes the number of linguistic values. The threshold value for confidence (denoted as α) is adapted automatically by minimizing the root mean square error (RMSE), and the fuzzy relations whose confidence is greater than α are selected. In this study, the training data are used to build the fuzzified relations, which are called fuzzy relations. The fuzzy relations selected by applying the threshold value are called fuzzy rules.
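The confidence of Equation 12 and the threshold selection can be sketched as follows. The function names `rule_confidence` and `select_rules`, the weight values, and the fixed α = 0.3 are assumptions for illustration; in the paper α is tuned by minimizing RMSE, which is not reproduced here.

```python
def rule_confidence(weights):
    """Equation 12: weight of Li -> Lj divided by the total weight of rules from Li."""
    totals = {}
    for (antecedent, _), w in weights.items():
        totals[antecedent] = totals.get(antecedent, 0.0) + w
    return {
        rule: w / totals[rule[0]]
        for rule, w in weights.items()
        if totals[rule[0]] > 0
    }


def select_rules(weights, alpha):
    """Keep the relations whose confidence is strictly greater than alpha."""
    confidence = rule_confidence(weights)
    return {rule: weights[rule] for rule, c in confidence.items() if c > alpha}


# Hypothetical relation weights, used only for illustration.
weights = {("L1", "L1"): 3.0, ("L1", "L2"): 1.0, ("L2", "L1"): 0.5, ("L2", "L2"): 0.5}
conf = rule_confidence(weights)
rules = select_rules(weights, alpha=0.3)
```

Relations that never occur have zero weight, so summing only over the observed consequents gives the same denominator as summing over all n(L) linguistic values.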
12. if $C_{L_i \rightarrow L_j}$ > α then
13. add Li → Lj to dataset FRule
14. until dataset Rule has no item
$$F(t+1) = \frac{\sum_{p=1}^{n_1} W^{p}_{L_i^{o_s},\, L_j^{o_s+1}} \cdot D(L_j)}{\sum_{p=1}^{n_1} W^{p}_{L_i^{o_s},\, L_j^{o_s+1}}}, \tag{13}$$
where F(t + 1) denotes the forecast value, n1 denotes the total number of fitting one‐period fuzzy rules, and W is the weight of a fitting fuzzy rule. D(Lj) is the defuzzified value of Lj, calculated by Equation 14.
$$D(L_j) = \frac{a_{L_j} + b_{L_j} + c_{L_j}}{3}, \tag{14}$$
where $a_{L_j}$, $b_{L_j}$, and $c_{L_j}$ are the lower bound, the midpoint, and the upper bound of the linguistic interval. If the antecedent of the forecasting data has no fitting fuzzy rule but its linguistic value lies within the range of linguistic values of the training rule set, find the fuzzy rules whose antecedents are the linguistic value plus and minus one, and use those fuzzy rules ($L_{i-1}^{o_v}$, $L_{i+1}^{o_v}$) to forecast by Equation 13. If the antecedent of the forecasting data has no fitting fuzzy rule and its linguistic value ($L_i^{o_v}$) lies outside all training rules, the one‐period forecast is obtained as the midpoint of the interval of the linguistic value ($L_i^{o_v}$) by Equation 15.
$$F(t+1) = \frac{I_{L_i}^{u} + I_{L_i}^{l}}{2}, \tag{15}$$
where $I_{L_i}^{u}$ is the upper value of the interval of the linguistic value and $I_{L_i}^{l}$ is the lower value of the interval of the linguistic value.
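The one‐order forecast with its two fallbacks can be sketched as follows. This is one possible reading of Equations 13 to 15 rather than the authors' code: linguistic values are represented by integer indices, the neighbouring rules of Li−1 and Li+1 are pooled into a single weighted average (an assumption, since the text does not say how they are combined), and all names and numbers are hypothetical.

```python
def defuzzify(interval):
    """Equation 14: mean of lower bound, midpoint, and upper bound."""
    a, b, c = interval
    return (a + b + c) / 3.0


def weighted_average(antecedents, rules, intervals):
    """Equation 13 over all rules whose antecedent index is in `antecedents`."""
    fitting = [(j, w) for (i, j), w in rules.items() if i in antecedents]
    total = sum(w for _, w in fitting)
    if total == 0:
        return None
    return sum(w * defuzzify(intervals[j]) for j, w in fitting) / total


def forecast_one_order(i, rules, intervals):
    """Forecast F(t+1) from the current linguistic index i."""
    value = weighted_average({i}, rules, intervals)
    if value is not None:
        return value
    trained = {lhs for lhs, _ in rules}
    if trained and min(trained) <= i <= max(trained):
        # Fallback 1: use the rules of the neighbouring linguistic values.
        value = weighted_average({i - 1, i + 1}, rules, intervals)
        if value is not None:
            return value
    # Fallback 2 (Equation 15): midpoint of the interval of Li.
    a, _, c = intervals[i]
    return (a + c) / 2.0


# Hypothetical intervals (a, b, c) and rule weights, used only for illustration.
intervals = {
    1: (0.0, 5.0, 10.0),
    2: (10.0, 15.0, 20.0),
    3: (20.0, 25.0, 30.0),
    4: (30.0, 35.0, 40.0),
}
rules = {(1, 1): 1.0, (1, 2): 3.0, (3, 3): 2.0}
f_direct = forecast_one_order(1, rules, intervals)
f_neighbour = forecast_one_order(2, rules, intervals)
f_midpoint = forecast_one_order(4, rules, intervals)
```

Index 1 has fitting rules and uses Equation 13 directly, index 2 has none but lies inside the trained range and borrows the rules of its neighbours, and index 4 lies outside the trained range and falls back to the interval midpoint.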
$$F(t+1) = \frac{\sum_{r_1=1}^{n_1} W^{r_1}_{L_i^{o_s},\, L_j^{o_s+1}} \cdot D(L_j) + \sum_{r_2=1}^{n_2} W^{r_2}_{L_i^{o_s-1},\, L_i^{o_s},\, L_j^{o_s+1}} \cdot D(L_j) + \cdots + \sum_{r_k=1}^{n_k} W^{r_k}_{L_i^{o_s-k+1},\, L_i^{o_s-k+2},\, \ldots,\, L_i^{o_s},\, L_j^{o_s+1}} \cdot D(L_j)}{\sum_{r_1=1}^{n_1} W^{r_1}_{L_i^{o_s},\, L_j^{o_s+1}} + \sum_{r_2=1}^{n_2} W^{r_2}_{L_i^{o_s-1},\, L_i^{o_s},\, L_j^{o_s+1}} + \cdots + \sum_{r_k=1}^{n_k} W^{r_k}_{L_i^{o_s-k+1},\, L_i^{o_s-k+2},\, \ldots,\, L_i^{o_s},\, L_j^{o_s+1}}} \tag{16}$$
where n1, n2, and nk denote the total numbers of fitting one‐order, two‐order, and k‐order fuzzy rules, respectively. If no fitting fuzzy rule exists, F(t + 1) is calculated by the same method as in the one‐order forecast.
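Equation 16 pools the fitting rules of every order into one weighted average. A minimal sketch, assuming integer indices for linguistic values and a hypothetical `rules_by_order` layout that maps each order to its rule weights; the fallback to the one‐order procedure is left to the caller and signalled by `None`.

```python
def defuzzify(interval):
    """Equation 14: mean of lower bound, midpoint, and upper bound."""
    a, b, c = interval
    return (a + b + c) / 3.0


def forecast_multi_order(history, rules_by_order, intervals):
    """Equation 16: weighted average over fitting rules of all orders.

    history holds the most recent linguistic indices, latest last.
    rules_by_order maps order -> {(antecedent_tuple, consequent): weight}.
    Returns None when no rule of any order fits.
    """
    numerator = 0.0
    denominator = 0.0
    for order, rules in rules_by_order.items():
        if order > len(history):
            continue
        antecedent = tuple(history[-order:])
        for (lhs, j), w in rules.items():
            if lhs == antecedent:
                numerator += w * defuzzify(intervals[j])
                denominator += w
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical intervals and rule weights, used only for illustration.
intervals = {1: (0.0, 5.0, 10.0), 2: (10.0, 15.0, 20.0), 3: (20.0, 25.0, 30.0)}
rules_by_order = {
    1: {((2,), 2): 2.0, ((2,), 3): 1.0},
    2: {((1, 2), 3): 1.0},
}
f_multi = forecast_multi_order([1, 2], rules_by_order, intervals)
f_none = forecast_multi_order([3, 3], rules_by_order, intervals)
```

With the history (L1, L2), the one‐order rules contribute 2 × 15 + 1 × 25 and the two‐order rule contributes 1 × 25, giving 80 / 4 = 20.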
$$\mathrm{MAPE} = \frac{\sum_{i=1}^{n} \dfrac{|F_i - A_i|}{A_i}}{n} \tag{17}$$
and
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (A_i - F_i)^2}{n}}, \tag{18}$$
where Fi is the forecast value of period i, Ai is the actual value of period i, and n is the total number of periods. These criteria are used to compare the forecasting results of the proposed method with those of the existing methods.
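The two criteria can be computed as in the following sketch. The function names and the sample values are hypothetical; MAPE is returned as a fraction, as Equation 17 is written, without any percentage scaling.

```python
import math


def mape(actual, forecast):
    """Equation 17: mean of |F_i - A_i| / A_i."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - a) / a for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Equation 18: square root of the mean squared error."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


# Hypothetical values, used only for illustration.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
mape_value = mape(actual, forecast)
rmse_value = rmse(actual, forecast)
```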
Input: the dataset of forecasting rules FRule, the weights W of the fitting fuzzy rules, the defuzzified value D(Lj) of Lj, the upper value $I_{L_i}^{u}$ and the lower value $I_{L_i}^{l}$ of the interval of the linguistic value
Output: forecasted value F(t + 1)
Method:
1. i ← 2
2. do
3. if ( i-1 = t )
4. $F(t+1) = \dfrac{\sum_{p=1}^{n_1} W^{p}_{L_i^{o_s},\, L_j^{o_s+1}} \cdot D(L_j)}{\sum_{p=1}^{n_1} W^{p}_{L_i^{o_s},\, L_j^{o_s+1}}}$ (Equation 13)
5. else
6. $F(t+1) = \dfrac{I_{L_i}^{u} + I_{L_i}^{l}}{2}$ (Equation 15)
7. end if
8. if ( i-3 = t-2 and i-2 = t-1 and i-1 = t )
9. F(t + 1) = the multi‐order weighted forecast of Equation 16
10. else
11. $F(t+1) = \dfrac{I_{L_i}^{u} + I_{L_i}^{l}}{2}$ (Equation 15)
12. end if
13. until i = 13
This section provides model verification, performance evaluation, and comparison for the proposed model. The daily gold price data set was collected from the Central Trust of China, and the exchange rates (U.S. Dollar vs. Taiwan Dollar) were obtained from the Central Bank of the Republic of China (Taiwan); both serve as verification data sets. MAPE and RMSE are used as evaluation criteria to compare the proposed model with other models, namely ARIMA (Box & Jenkins, 1976), GRNN (Specht, 1991), and the fuzzy time‐series models proposed by Chen (1996) and Yu (2005).
Note: The vertical dots denote that there are many similar data values in this Table.
L1 → L1 0.465116
L1 → L2 0.356589
L2 → L1 0.480620
L2 → L2 3.007752
L3 → L2 0.232558
L3 → L4 0.015504
⋮ ⋮
L33 → L34 1.565891
L34 → L34 0.906977
Note: The vertical dots denote that there are many similar data values in this Table.
L1, L1 → L4 0.015504
L1, L2 → L2 0.356589
L1, L4 → L3 0.015504
L2, L1 → L1 0.465116
L2, L1 → L2 0.015504
⋮ ⋮
L34, L33 → L33 0.519380
L34, L34 → L23 0.604651
Note: The vertical dots denote that there are many similar data values in this Table.
2005/4/1 16,843 ─ ─
2005/4/4 16,735 16,747 16,750
2005/4/7 16,837 16,794 16,793
2005/4/8 16,791 16,808 16,835
2005/4/11 16,869 16,903 16,895
2005/4/12 16,874 16,903 16,890
2005/4/13 16,862 16,903 16,890
2005/4/14 16,879 16,903 16,890
2005/4/15 16,779 16,808 16,806
2005/4/18 16,759 16,747 16,739
⋮ ⋮ ⋮ ⋮
2005/9/28 19,192 19,368 19,366
2005/9/29 19,467 19,368 19,366
Note: The vertical dots denote that there are many similar data values in this table, and the symbol “─” denotes no available data.
Note. (1) “Proposed A” denotes the proposed one‐order model; “Proposed B” denotes the proposed multi‐order model. (2) The lag period of ARIMA(2,1,1) was optimized by the EViews lag test. (3) The highlighted entries denote the best performance among the eight models for each year's data set.
by one‐order and multi‐order are 108.61 and 88.74, respectively; from the results, the proposed model achieves a smaller RMSE than the other models. Therefore, the proposed method substantially improves gold price forecasting. The three‐order model performs better than the one‐order model because the gold prices of the three previous periods all influence the forecast price. The proposed three‐order method is therefore suitable for gold price forecasting.
5 | PROFIT EVALUATIONS
To demonstrate the profit of the proposed model, the profit computation method proposed by Cheng and Wei (2009) is employed. The method is defined as follows:
If $\dfrac{|\mathrm{forecast}(t) - \mathrm{actual}(t)|}{\mathrm{actual}(t)} \le \alpha$ and forecast(t + 1) − actual(t) > 0, then sell.
Chen (1996) 1.20 0.493 1.39 0.503 1.20 0.459 1.15 0.383 1.24 0.425
Yu (2005) 0.41 0.176 0.27 0.269 0.60 0.239 0.30 0.132 0.78 0.300
Chang et al. (2007)—1‐order 1.31 0.659 0.87 0.334 0.11 0.053 0.27 0.095 0.86 0.345
Chang et al. (2007)—n‐order 0.56 0.243 0.54 0.199 0.11 0.053 0.27 0.095 0.86 0.345
ARIMA 0.50 0.565 0.15 0.233 0.22 0.261 1.75 2.653 0.30 0.343
GRNN (Specht, 1991) 1.14 0.999 1.12 0.663 1.73 0.460 0.69 0.531 1.15 0.821
Proposed A 0.22 0.094 0.26 0.107 0.18 0.079 0.08 0.040 0.34 0.127
Proposed B 0.30 0.144 0.31 0.127 0.21 0.089 0.15 0.071 0.44 0.187
Note. “Proposed A” denotes proposed one‐order model; “Proposed B” is proposed 3‐order model; and the highlighted entries denote the best performance
in eight models for each year dataset.
Models Profits
Note: The highlighted entries denote the best performance in eight models for each year dataset.
If $\dfrac{|\mathrm{forecast}(t) - \mathrm{actual}(t)|}{\mathrm{actual}(t)} \le \alpha$ and forecast(t + 1) − actual(t) < 0, then buy,
where α is a threshold parameter that depends on the data set. The profit is then defined by Equation 19.
$$\mathrm{profit} = \sum_{t_s=1}^{p} \left(\mathrm{actual}(t+1) - \mathrm{actual}(t)\right) + \sum_{t_b=1}^{q} \left(\mathrm{actual}(t) - \mathrm{actual}(t+1)\right), \tag{19}$$
where p is the total number of days for selling, q is the total number of days for buying, and ts and tb are the t‐th days for selling and buying, respectively. This paper uses the one‐and‐a‐half‐year (2004/04–2005/09) gold price data set and the 5‐year (2004–2008) exchange rate data set to compute the profit and compares the proposed model with Chen's model (1996) and Yu's model (2005), as shown in Tables 9 and 10. The results show that the proposed Models A and B yield more profit than Chen's model (1996) and Yu's model (2005). The profit evaluation also indicates that better forecast accuracy leads to better profit (Tables 7–10), and that the timing of selling and buying depends on the optimal threshold parameter α in the training data set. In the stock market, however, the main factor in the return is the investment strategy (threshold value) (Ma, Xiong, He, & Zhang, 2017). Ma et al. (2017) built a basic trading strategy using the directional change approach to select the optimal threshold θ and studied the market performance of portfolios with different estimation and investment periods.
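The sell and buy rules together with Equation 19 can be sketched as follows. The function name `trading_profit`, the price series, and the value of α are hypothetical; the sketch applies the two rules exactly as they are stated in the text and adds the corresponding term of Equation 19 on each signalled day.

```python
def trading_profit(actual, forecast, alpha):
    """Accumulate Equation 19 over the days on which a rule fires."""
    profit = 0.0
    for t in range(len(actual) - 1):
        relative_error = abs(forecast[t] - actual[t]) / actual[t]
        if relative_error > alpha:
            continue  # forecast at t is not accurate enough to act on
        signal = forecast[t + 1] - actual[t]
        if signal > 0:
            # "sell" rule of the text: first sum of Equation 19
            profit += actual[t + 1] - actual[t]
        elif signal < 0:
            # "buy" rule of the text: second sum of Equation 19
            profit += actual[t] - actual[t + 1]
    return profit


# Hypothetical prices and forecasts, used only for illustration.
actual = [100.0, 102.0, 101.0, 105.0]
forecast = [100.5, 101.5, 101.2, 104.0]
profit = trading_profit(actual, forecast, alpha=0.01)
no_trades = trading_profit(actual, forecast, alpha=0.001)
```

A tighter α filters out more days: with α = 0.001 none of the three forecasts is accurate enough, so no rule fires and the profit is zero.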
6 | CONCLUSIONS
This study has proposed a novel fuzzy time series forecasting model that fuzzifies historical data in a more objective and reasonable way and builds weighted high‐order fuzzy rules based on Apriori association rules. In the experiments on the gold price and exchange rate data sets, the proposed Models A and B outperform the listed models under the MAPE, RMSE, and profit criteria. The proposed model is better than the listed models for the following four reasons:
After the length of interval is objectively calculated by Equation 5, the proposed A automatically decides the lower and upper bounds, so the universe of discourse is more reasonable than in past methods. That is, the proposed A handles linguistic intervals objectively because it partitions them based on the distribution of observations. Therefore, the proposed A is more objective than Chen's model (Chen, 1996) and Yu's model (Yu, 2005).
Association rules can find the relationship between previous orders and the next order. In financial markets, the price is influenced not only by the previous order; the change in the previous price is also one of the most important factors for forecasting the next order (Cheng & Wei, 2009). According to the experimental results, the prices of the three previous orders influence the next price in the gold market, whereas in the exchange rate data set the next price is affected by the price of one previous order. Therefore, this paper uses association rules and obtains better forecasting performance than other methods in financial markets.
The weight of each fuzzy rule is obtained by calculating the cardinality of the same fuzzy relations. In contrast, Chen (1996) uses the centre of gravity method to defuzzify with equal weights. For this reason, the weights of association rules are more reasonable than those in Chen's model (Chen, 1996).
ARIMA is a linear time series model. In the real world, however, there are many uncertainties and nonlinear interactions. The proposed model uses a nonlinear rule‐based method (fuzzy set theory) to deal with this problem; therefore, the proposed method can achieve better performance than ARIMA.
Although the proposed method achieves considerable improvement, there is still room to improve. Future work can proceed in two directions: (a) To demonstrate its generality, the proposed model can be applied in other areas, such as energy (Sadaei, Guimarães, da Silva, Lee, & Eslami, 2017) and medicine (Zhang et al., 2016). (b) In this study, high‐order forecasting combines the data of the three previous orders to forecast the next order; future work can use more than three orders.
REFERENCES
Agrawal, R., & Srikant, R. (1994). Fast algorithms for mining association rules. Proceedings of the VLDB conference, 487–499.
Box, G. E. P., & Jenkins, G. M. (1976). Time series analysis: Forecasting and control. San Francisco, CA: Holden‐Day.
Chang, J. R., Lee, Y. T., Liao, S. Y., & Cheng, C. H. (2007). Cardinality‐based fuzzy time series for forecasting enrollments. Lecture Notes in Computer Science,
4570, 735–744.
Chen, S. M. (1996). Forecasting enrollments based on fuzzy time series. Fuzzy Sets and Systems, 81, 311–319.
Chen, S. M. (2002). Forecasting enrollments based on high‐order fuzzy time series. Cybernetics and Systems, 33, 1–16.
Chen, S. M., & Chung, N. Y. (2006). Forecasting enrollments using high‐order fuzzy time series and genetic algorithms. International Journal of Intelligent Systems, 21, 485–501.
Chen, S. M., & Hsu, C. C. (2004). A new method to forecast enrollments using fuzzy time series. Applied Science and Engineering, 2, 234–244.
Chen, S. M., & Hwang, J. R. (2000). Temperature prediction using fuzzy time series. IEEE Transactions on Systems Man and Cybernetics Part B, 30(2), 263–275.
Cheng, C. H., Chang, J. R., & Yeh, C. A. (2006). Entropy‐based and trapezoid fuzzification‐based fuzzy time series approaches for forecasting IT project cost.
Technological Forecasting and Social Change, 73(5), 524–542.
Cheng, C. H., Chen, T. L., Teoh, H. J., & Chiang, C. H. (2008). Fuzzy time‐series based on adaptive expectation model for TAIEX forecasting. Expert Systems
with Applications, 34, 1126–1132.
Cheng, C. H., Cheng, G. W., & Wang, J. W. (2008). Multi‐attribute fuzzy time series method based on fuzzy clustering. Expert Systems with Applications, 34(2), 1235–1242.
Cheng, C. H., & Wei, L. Y. (2009). Fusion ANFIS models based on multi‐stock volatility causality for TAIEX forecasting. Neurocomputing, 72(16–18),
3462–3468.
Faff, R. W., Brooks, R. D., & Kee, H. Y. (2002). New evidence on the impact of financial leverage on beta risk: A time‐series approach. North American Journal
of Economics and Finance, 13, 1–20.
Huarng, K. (2001a). Effective lengths of intervals to improve forecasting in fuzzy time series. Fuzzy Sets and Systems, 123, 387–394.
Huarng, K. (2001b). Heuristic models of fuzzy time series for forecasting. Fuzzy Sets and Systems, 123(3), 369–386.
Huarng, K., & Yu, H. K. (2003). An N‐th order heuristic fuzzy time series model for TAIEX forecasting. International Journal of Fuzzy Systems, 5(4), 247–253.
Huarng, K., & Yu, H. K. (2004). Type 2 fuzzy time series for TAIEX forecasting. Paper presented at the Taiwan–Japan symposium 2004 on fuzzy system and
innovational computing.
Huarng, K., & Yu, H. K. (2005). A type 2 fuzzy time‐series model for stock index forecasting. Physica A, 353, 445–462.
Huarng, K. H., & Yu, H. K. (2006). The application of neural networks to forecast fuzzy time series. Physica A, 336, 481–491.
Hwang, J. R., Chen, S. M., & Lee, C. H. (1998). Handling forecasting problems using fuzzy time series. Fuzzy Sets and Systems, 100, 217–228.
Kantardzic, M. (Ed.) (2002, October). Data mining: concepts, models, methods, and algorithms. New Jersey: Wiley‐IEEE Press.
Lee, L. W., Wang, L. H., & Chen, S. M. (2007). Temperature prediction and TAIFEX forecasting based on fuzzy logical relationships and genetic algorithms.
Expert Systems with Applications, 33(3), 539–550.
Lee, L. W., Wang, L. H., & Chen, S. M. (2008). Temperature prediction and TAIFEX forecasting based on high‐order fuzzy logical relationships and genetic
simulated annealing techniques. Expert Systems with Applications, 34, 328–336.
Liu, J. W., Cheng, C. H., Su, C. H., & Tsai, M. C. (2011). Fuzzy multiple‐periods time‐series model based on GSP and AR for forecasting gold price. Advanced Materials Research, 211–212, 1124–1128 (2011 International Conference on Mechatronics and Intelligent Materials, Guilin, China).
Ma, J., Xiong, X., He, F., & Zhang, W. (2017). Volatility measurement with directional change in Chinese stock market: Statistical property and investment
strategy. Physica A: Statistical Mechanics and its Applications, 471, 169–180.
Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity of processing information. Psychological Review, 63, 81–97.
Ross, T. J. (Ed.) (2000). Fuzzy logic with engineering applications (International ed.). New York: McGraw‐Hill.
Sadaei, H. J., Guimarães, F. G., da Silva, C. J., Lee, M. H., & Eslami, T. (2017). Short‐term load forecasting method based on fuzzy time series, seasonality and
long memory process. International Journal of Approximate Reasoning, 83, 196–217.
Shin, H. W., & Sohn, S. Y. (2004). Segmentation of stock trading customers according to potential value. Expert Systems with Applications, 27, 27–33.
Singh, P., & Borah, B. (2013). An efficient time series forecasting model based on fuzzy time series. Engineering Applications of Artificial Intelligence, 26(10),
2443–2457.
Song, Q., & Chissom, B. S. (1993a). Forecasting enrollments with fuzzy time series—Part I. Fuzzy Sets and Systems, 54(1), 10.
Song, Q., & Chissom, B. S. (1993b). Fuzzy time series and its models. Fuzzy Sets and Systems, 54(3), 269–277.
Song, Q., & Chissom, B. S. (1994). Forecasting enrollments with fuzzy time‐series—Part II. Fuzzy Sets and Systems, 62, 1–8.
Specht, D. (1991). A general regression neural network. IEEE Transactions on Neural Networks, 2, 568–576.
Su, C. H., & Cheng, C. H. (2016). A hybrid fuzzy time series model based on ANFIS and integrated nonlinear feature selection method for forecasting stock.
Neurocomputing, 205, 264–273.
Teoh, H. J., Chen, T. L., Cheng, C. H., & Chu, H. H. (2009). A hybrid multi‐order fuzzy time series for forecasting stock markets. Expert Systems with Applica-
tions, 36, 7888–7897.
Wang, Y. F. (2002). Predicting stock price using fuzzy grey prediction system. Expert Systems with Applications, 22, 33–39.
Yolcu, O. C., & Lam, H. K. (2017). A combined robust fuzzy time series method for prediction of time series. Neurocomputing, 247, 87–101.
Yu, H. K. (2005). Weighted fuzzy time series models for TAIEX forecasting. Physica A, 349, 609–624.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.
Zadeh, L. A. (1975a). The concept of a linguistic variable and its application to approximate reasoning—Part I. Information Science, 8, 199–249.
Zadeh, L. A. (1975b). The concept of a linguistic variable and its application to approximate reasoning II. Information Science, 8, 301–357.
Zadeh, L. A. (1976). The concept of a linguistic variable and its application to approximate reasoning III. Information Science, 9, 43–80.
Zadeh, L. A. (1988). Fuzzy Logic. IEEE Computer, 21, 83–93.
Zhang, T., Zhang, X., Liu, Y., Luo, Y., Zhou, T., & Li, X. (2016). The analysis of infectious disease surveillance data based on fuzzy time series method. Inter-
national Journal of Infectious Diseases, 45, 309–310.
Ching‐Hsue Cheng received the bachelor's degree in Mathematics from Chinese Military Academy in 1982, the master's degree in Applied Mathematics from Chung‐Yuan Christian University in 1988, and the PhD degree in System Engineering and Management from National Defense University in 1994. He is now a professor in the Department of Information Management at National Yunlin University of Science and Technology. His research is mainly in the fields of fuzzy logic, fuzzy time series, soft computing, reliability, medical information management, and data mining. He has published more than 200 papers (including 195 significant journal papers).
Chung‐Hsi Chen was born in Tainan, Taiwan. He received his master's degree in Management Information System from National Yunlin University of Science and Technology, Taiwan. He is currently a PhD student at the National Yunlin University of Science and Technology, Taiwan, and a teacher at Nanzih elementary school, Tainan.
How to cite this article: Cheng C‐H, Chen C‐H. Fuzzy time series model based on weighted association rule for financial market forecast-
ing. Expert Systems. 2018;e12271. https://doi.org/10.1111/exsy.12271