The 28th North American Fuzzy Information Processing Society Annual Conference (NAFIPS2009)
Cincinnati, Ohio, USA - June 14 - 17, 2009
Generating Dynamic Fuzzy Models for Prediction
Problems
Juan Contreras
Department of Naval Engineering
Escuela Naval Almirante Padilla
Cartagena, Colombia
epcontrerasj@ieee.org

Oscar Acuña
Department of Electrical Engineering
Universidad Tecnológica de Bolívar
Cartagena, Colombia
oacuna@unitecnologica.edu.co
Abstract— In this paper we present a new method to generate
interpretable fuzzy systems from training data. A fuzzy system is
developed for nonlinear system modeling and for system state
forecasting. The antecedent partition uses triangular sets with an
overlap of 0.5, avoiding the complex overlapping that occurs in
other methods. Singleton consequents are employed and the least
squares method is used to adjust them. This approach is not a
hybrid system and does not employ other techniques, such as
neural networks or genetic algorithms. Two benchmark problems
are used to illustrate our approach: the first is an input-output
NARMAX model, one of the most popular models in the neural
and fuzzy literature; the second is the chaotic, nonperiodic and
nonconvergent Mackey-Glass series, commonly used to evaluate
time series forecasting schemes.
Keywords: fuzzy identification; least squares method;
interpretability; dynamic systems
I. INTRODUCTION
One of the first proposals to automatically design a fuzzy
system from data was the table look-up scheme [1]. However,
when the number of inputs and membership functions is large,
the number of fuzzy rules increases exponentially [2]. Sugeno
and Yasukawa [3] proposed a methodology to identify fuzzy
model parameters using singleton consequents, but it requires
many rules and has poor descriptive capacity.
Different approaches have been proposed to generate fuzzy
models from input-output data [3]-[10], but they typically seek
good accuracy, while interpretability of the fuzzy model is
not their first concern. In fuzzy systems, the resulting models
need some transparency, i.e., their information must be
interpretable, so as to permit a deeper understanding of the
system under study [8].
Interpretability is defined by at least five criteria [8], [9]:
a. Distinguishability. The membership functions should be clearly different and each linguistic label should have semantic meaning.
b. Any element from the universe of discourse should belong to at least one of the fuzzy sets.
c. Because each linguistic label has semantic meaning, at least one of the values in the universe of discourse should have a membership degree equal to one. In other words, all the fuzzy sets should be normal.
d. The number of membership functions should not exceed the limit of 9 distinct terms.
e. The number of rules should be limited according to human cognitive issues.
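Criteria b, c and d can be checked numerically once the membership functions are sampled over the universe of discourse. The following Python sketch is illustrative only: the helper name `check_partition`, the sampling grid and the three-set triangular partition on [0, 1] are assumptions introduced here and are not part of the method.

```python
import numpy as np

def check_partition(memberships, max_sets=9):
    """Check criteria b, c and d on sampled membership degrees.

    memberships: array of shape (n_points, n_sets), one column per fuzzy set,
    sampled over the universe of discourse (hypothetical helper).
    """
    m = np.asarray(memberships, dtype=float)
    return {
        # b: every sampled point belongs to at least one set
        "coverage": bool(np.all(m.max(axis=1) > 0.0)),
        # c: every set reaches a membership degree of one (normal sets)
        "normal": bool(np.all(np.isclose(m.max(axis=0), 1.0))),
        # d: no more than max_sets linguistic terms
        "few_sets": bool(m.shape[1] <= max_sets),
    }

# Assumed example: three triangular sets on [0, 1] with modal values 0, 0.5 and 1.
x = np.linspace(0.0, 1.0, 101)
centers = np.array([0.0, 0.5, 1.0])
tri = np.maximum(0.0, 1.0 - np.abs(x[:, None] - centers[None, :]) / 0.5)

report = check_partition(tri)
scaled_report = check_partition(0.8 * tri)  # subnormal sets violate criterion c
```

Criteria a and e are qualitative and are not covered by this check.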
Fig. 1 shows three membership functions with strong
overlapping, which makes it very difficult to label them.
This occurs frequently when neural networks or genetic
algorithms are used during the training process.
Time series prediction is a problem with a wide range of
applications, including energy systems planning, flood
forecasting, traffic control, stock exchange operations and weather
prediction. Accordingly, a number of different prediction
approaches have been proposed [2], [11]-[14]. Basically, time
series prediction can be considered as a modeling problem. The
first step is establishing a mapping between input(s) and
output(s). Usually, the mapping is nonlinear and chaotic. After
such a mapping is set up, it is used to predict future values
based on past and current observations [13].
This paper presents a new approach for the development of
linguistically interpretable fuzzy models from data. The
approach has been used in system identification [15], [16], but
here it is used as a fuzzy predictor.
The methodology used in this paper to obtain the fuzzy model
from input and output data is presented in two phases. In the
first, the inference error method is presented, followed by the
fuzzy identification algorithm that generates an interpretable
fuzzy model. In the second, the method is applied to two
well-known benchmarks: the first is an input-output NARMAX
model, one of the most popular models in the neural and fuzzy
literature [2], [17]; the second is the chaotic, nonperiodic and
nonconvergent Mackey-Glass series, commonly used to evaluate
time series forecasting schemes [12].
Fig. 1. Non-interpretable distribution of membership functions

II. FUZZY IDENTIFICATION APPROACH
A. Inference Error
A fuzzy rule "if u is A, then y is B", where u and y represent two numeric variables and A ⊂ U and B ⊂ Y are fuzzy input and output sets defined on the universes U and Y, respectively, is equivalent to the inequality

$$\mu_A(u) \le \mu_B(y) \tag{1}$$

The inference error ε is given by

$$\varepsilon = \begin{cases} 0, & \mu_A(u) \le \mu_B(y) \\ \mu_A(u) - \mu_B(y), & \mu_A(u) > \mu_B(y) \end{cases} \tag{2}$$

A fuzzy rule of the kind "If u is A, then y is B" with a null inference error must fulfill the condition

$$\mu_A(u) = \mu_B(y) \tag{3}$$

In this approach, the consequent $\mu_B(y)$ will be a singleton, because fuzzy models with singleton consequents can be easily understood and adjusted [18].
If the system has m inputs, it must be represented by rules of the kind "If u1 is A1 and u2 is A2 and … and um is Am, then y is B", and the generated system must fulfill the condition

$$\mu_{A_1}(x_k) \wedge \mu_{A_2}(x_k) \wedge \dots \wedge \mu_{A_m}(x_k) = \mu_B(y_k) \tag{4}$$

where ∧ represents a t-norm, or an aggregation operator, of fuzzy logic.
B. Fuzzy Model Structure
1) Membership Functions
The universe of each input variable is partitioned, during the learning process, into normalized triangular sets with a specific overlap of 0.5. Triangular membership functions allow the linguistic value to be reconstructed at the same numeric value after a defuzzification method has been applied [19]. In addition, the overlap at 0.5 ensures that the supports of the fuzzy sets are different. The fuzzy sets of the output variable are singletons.
2) Distribution of the Membership Functions
The triangular fuzzy sets of the input variables are distributed symmetrically over each respective universe.
3) Operators
OWA operators are used to combine the antecedents [21].
4) Inference Method

$$f(x^{(i)}) = \frac{\sum_{j=1}^{L} y^j \, m_j(x^{(i)})}{\sum_{j=1}^{L} m_j(x^{(i)})} \tag{5}$$

where

$$m_j(x^{(i)}) = \mu_{A_1^j}(x_1^{(i)}) \cdot \mu_{A_2^j}(x_2^{(i)}) \cdots \mu_{A_n^j}(x_n^{(i)}) \tag{6}$$

is the output grade of the jth rule of a Sugeno fuzzy system and $y^j$ is the singleton value corresponding to rule j. Because we use a fuzzy partition of normalized triangular sets with a specific overlap of 0.5, $\sum_{j=1}^{L} m_j(x^{(i)})$ equals the number of input variables. Then (5) can be expressed as

$$f(x^{(i)}) = \frac{\sum_{j=1}^{L} y^j \, m_j(x^{(i)})}{p} \tag{7}$$

where p is the number of input variables.
C. Fuzzy Identification Algorithm
Given a collection of experimental input and output data $\{x_k, y_k\}$, k = 1, ..., N, where $x_k$ is the p-dimensional input array $x_k^1, x_k^2, \dots, x_k^p$ and $y_k$ is the one-dimensional output, the algorithm is defined by the following steps:
1. Organization of the set of N input-output data pairs $\{x_k^{(i)}, y^{(i)}\}$, with i = 1, ..., N and k = 1, ..., p, where $x^{(i)} \in \Re^p$ are input arrays and $y^{(i)}$ are output scalars. See Fig. 2.
2. Determination of the universe range of each variable, according to the maximum and minimum values of the associated data, $[x_k^-, x_k^+]$ and $[y^-, y^+]$.
Fig. 2. Data organization
3. Distribution of triangular membership functions over each universe. As a general condition, the vertex with membership value one (the modal value) falls at the middle of the region covered by the membership function, while the other two vertices, with membership values equal to zero, fall at the middle of the two neighboring regions. See Fig. 3.
4. Calculation of the position of the modal values of the input variable(s), according to

$$\text{if } \mu_{A_k^{(n)}}(x_k^{(i)}) = 1 \text{ then } ys_k^{(n)} = y^{(i)} \tag{8}$$

where $ys_k^{(n)}$ corresponds to the projection onto the output space of the evaluation of datum $x^{(i)}$ of the k-th input variable at the n-th set of the corresponding partition. The output value corresponding to this projection is given by the value at the i-th position of the output array y. See Fig. 4.
Fig. 4. Generating consequents
5. Rule determination. Initially, the number of rules is equal to the number of sets of each input variable multiplied by the number of variables; in other words, $n \times k$. The membership function associated with a consequent is the antecedent of the corresponding rule. Antecedents of rules with the same consequent are merged using an OWA operator, thus reducing the number of rules, as seen in Fig. 5.
6. Model validation using the inference method described by (7).
7. Parameter adjustment, relocating the output singletons using the least squares method. Equation (7) can be expressed in the form

$$f(x^{(i)}) = \sum_{j=1}^{L} y^j \, w_j(x^{(i)}) \tag{8}$$

where

$$w_j(x^{(i)}) = \frac{m_j(x^{(i)})}{p} = w_j^i \tag{9}$$
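Output values can be represented as $Y = W\theta + E$, which in matrix form is given by

$$\underbrace{\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{bmatrix}}_{Y} = \underbrace{\begin{bmatrix} w_1^1 & w_2^1 & \cdots & w_L^1 \\ w_1^2 & w_2^2 & \cdots & w_L^2 \\ \vdots & \vdots & & \vdots \\ w_1^n & w_2^n & \cdots & w_L^n \end{bmatrix}}_{W} \underbrace{\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^L \end{bmatrix}}_{\theta} + \underbrace{\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}}_{E} \tag{10}$$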
where E is the approximation error, which should be minimized. Using the quadratic error norm, we have:

$$\theta = (W^T W)^{-1} W^T Y \tag{11}$$

The solution is valid if and only if

$$\operatorname{rank}(W^T W) = \dim(\theta) \tag{12}$$

Equation (12) implies that all rules must receive enough excitation during training. In practice this is not always guaranteed, in which case the method produces catastrophic results for rules with low excitation and a significant bias in rules with sufficient excitation. This problem can be solved using recursive least squares (RLS). The RLS algorithm is [9]

$$\theta(k+1) = \theta(k) + \gamma(k)\,[\,y(k+1) - W_{k+1}\,\theta(k)\,] \tag{13}$$

where

$$W_k = \{ w_1^k, w_2^k, \dots, w_L^k \} \tag{14}$$

$$\gamma(k) = \frac{P(k)\,W_{k+1}^T}{W_{k+1}\,P(k)\,W_{k+1}^T + 1} \tag{15}$$

$$P(k+1) = [\,I - \gamma(k)\,W_{k+1}\,]\,P(k) \tag{16}$$

with the initial value P(0) obtained in step 4 of this algorithm.
Fig. 3. Triangular sum-1 partition
Fig. 5. Generating rules
8. Stop if either the mean squared error (MSE) is no greater than a previously established value or the number of membership functions exceeds 9. Otherwise, increase by 1 the number of sets in the partition of each input variable and return to step 3.
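Steps 2, 3 and 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. All function names are hypothetical. Each rule is assumed to have a single-variable antecedent, so L equals the number of sets times the number of inputs, as in step 5 before merging. `np.linalg.lstsq` is used in place of the explicit inverse in (11) because the columns of W are linearly dependent when p > 1 (the memberships of each variable sum to one). For the recursive version, P(0) is assumed to be a large multiple of the identity and θ(0) is assumed to be zero.

```python
import numpy as np

def triangular_partition(x, lo, hi, n_sets):
    """Membership degrees of x in n_sets evenly spaced triangular sets on [lo, hi].

    Neighbouring sets cross at 0.5 and the degrees of each sample sum to one.
    """
    x = np.clip(np.asarray(x, dtype=float), lo, hi)
    centers = np.linspace(lo, hi, n_sets)       # modal values (step 3)
    width = (hi - lo) / (n_sets - 1)            # distance between modal values
    return np.maximum(0.0, 1.0 - np.abs(x[:, None] - centers[None, :]) / width)

def regression_matrix(X, n_sets):
    """Matrix W of (10): memberships of every input variable, divided by p as in (9)."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    blocks = [
        # universe range taken from the data (step 2)
        triangular_partition(X[:, k], X[:, k].min(), X[:, k].max(), n_sets)
        for k in range(p)
    ]
    return np.hstack(blocks) / p

def fit_batch(W, y):
    """Least squares singletons; lstsq tolerates the rank deficiency of W."""
    theta, *_ = np.linalg.lstsq(W, y, rcond=None)
    return theta

def fit_rls(W, y, p0=1e6):
    """Recursive least squares following (13)-(16); p0 and theta(0) = 0 are assumptions."""
    n_rules = W.shape[1]
    theta = np.zeros(n_rules)
    P = p0 * np.eye(n_rules)
    for w, target in zip(W, y):
        gain = P @ w / (w @ P @ w + 1.0)                  # (15)
        theta = theta + gain * (target - w @ theta)       # (13)
        P = P - np.outer(gain, w) @ P                     # (16)
    return theta

# Assumed toy data: two inputs on a grid and a linear target.
grid = np.linspace(0.0, 1.0, 21)
X = np.array([(a, b) for a in grid for b in grid])
y = 2.0 * X[:, 0] + 3.0 * X[:, 1]

W = regression_matrix(X, n_sets=3)
theta_batch = fit_batch(W, y)
theta_rls = fit_rls(W, y)
rmse_batch = float(np.sqrt(np.mean((W @ theta_batch - y) ** 2)))
```

With this structure the model output is simply `W @ theta`, which is the matrix form (10) without the error term. The loop of step 8 would call `regression_matrix` with an increasing `n_sets` until the error target is met or `n_sets` exceeds 9.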
III. RESULTS
A. Example I
The nonlinear plant to be identified is governed by the following difference equation:

$$y(k+1) = \frac{y(k)\,y(k-1)\,y(k-2)\,u(k-1)\,[\,y(k-2) - 1\,] + u(k)}{1 + y(k-1)^2 + y(k-2)^2} \tag{17}$$

The output y(k + 1) depends on two previous inputs [u(k), u(k − 1)] and on three previous outputs [y(k), y(k − 1), y(k − 2)]. The following input is used for testing:

$$u(k) = \begin{cases} \sin(\pi k / 25), & k < 250 \\ 1.0, & 250 \le k < 500 \\ -1.0, & 500 \le k < 750 \\ 0.3\sin(\pi k / 25) + 0.1\sin(\pi k / 32) + 0.6\sin(\pi k / 10), & 750 \le k < 1000 \end{cases}$$

The five variables u(k), u(k − 1), y(k), y(k − 1) and y(k − 2) are fed as inputs. Fig. 6 shows the root mean square error (RMSE) achieved versus the number of membership functions. With 7 membership functions an RMSE of 0.0247 is achieved, lower than the RMSE of 0.0265 reported in [17].
Fig. 6. Root mean square error (RMSE) vs. the number of triangular membership functions for Example I
Because of the page limit of this paper, only the structure of a fuzzy model with 3 membership functions is shown in Fig. 7. It is not difficult to assign linguistic terms to each of the membership functions. The labels S, M and B denote, respectively, the linguistic terms "small", "medium" and "big". The RMSE achieved in this case is 0.0398. With 5 input variables and 3 triangular membership functions for each input variable, 15 consequents (and 15 rules) are generated (step 5 of the fuzzy identification algorithm). The output of the obtained fuzzy model and the testing results are shown in Fig. 8.
Fig. 7. Partitions of u(k − 1), u(k), y(k − 2), y(k − 1) and y(k) with 3 triangular membership functions
The array of singleton consequents is:

$$\theta(k) = [\,-0.5950,\ 0.0005,\ 0.5318,\ 0.0677,\ -0.0297,\ -0.1008,\ -0.5099,\ 0.0481,\ 0.3990,\ 0.0205,\ -0.0150,\ -0.0683,\ 0.0415,\ 0.0492,\ -0.1535\,]^T$$

Then, the output of the fuzzy model can be calculated using (10), thus

$$\underbrace{\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{bmatrix}}_{Y} = \underbrace{\begin{bmatrix} \mu_S^1(x(k)) & \mu_M^1(x(k)) & \cdots & \mu_B^1(y(k-2)) \\ \mu_S^2(x(k)) & \mu_M^2(x(k)) & \cdots & \mu_B^2(y(k-2)) \\ \vdots & \vdots & & \vdots \\ \mu_S^n(x(k)) & \mu_M^n(x(k)) & \cdots & \mu_B^n(y(k-2)) \end{bmatrix}}_{W} \underbrace{\begin{bmatrix} -0.595 \\ 0.0005 \\ \vdots \\ -0.1535 \end{bmatrix}}_{\theta}$$

Fig. 8. Output of the nonlinear plant (solid curve) and of the fuzzy model (dotted curve) in Example I
B. Example II
Now we apply our approach to one of the most commonly used benchmarks in system identification: the prediction of the Mackey-Glass time series described by

$$\dot{x}(t) = \frac{0.2\,x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\,x(t) \tag{18}$$

We use a more challenging dataset, with initial conditions x(0) = 1.2, x(t) = 0 for t < 0, dt = 1 and τ = 30 (instead of the generally used τ = 17) [12], [20]. In this case x(t − 30), x(t − 20), x(t − 10) and x(t) are used to predict x(t + 10). Fig. 9 shows the root mean square error (RMSE) achieved versus the number of membership functions. A comparison between the output of the obtained fuzzy model and the testing results is shown in Fig. 10.
Fig. 9. Root mean square error (RMSE) vs. the number of triangular membership functions for Example II
Fig. 10. Output of the nonlinear plant (solid curve) and of the fuzzy model (dotted curve) in Example II
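The Mackey-Glass data set of Example II can be reproduced approximately with the Python sketch below. The integration scheme is not stated in the paper; a forward Euler step with dt = 1 is assumed here, and the function names `mackey_glass` and `embed` are hypothetical. The second function builds the input vectors [x(t − 30), x(t − 20), x(t − 10), x(t)] and the target x(t + 10).

```python
import numpy as np

def mackey_glass(n_steps, tau=30, x0=1.2, dt=1.0):
    """Forward Euler integration (assumed scheme) of the Mackey-Glass equation.

    x(0) = x0 and x(t) = 0 for t < 0; with dt = 1 the delay is tau samples.
    """
    x = np.zeros(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        delayed = x[t - tau] if t >= tau else 0.0    # x(t - tau), zero history
        x[t + 1] = x[t] + dt * (0.2 * delayed / (1.0 + delayed**10) - 0.1 * x[t])
    return x

def embed(series, lags=(30, 20, 10, 0), horizon=10):
    """Lagged inputs and the value `horizon` steps ahead as target."""
    t = np.arange(max(lags), len(series) - horizon)
    inputs = np.column_stack([series[t - lag] for lag in lags])
    return inputs, series[t + horizon]

x = mackey_glass(1000)
X, y = embed(x)
```

The four columns of `X` play the role of the input variables of the fuzzy predictor, one triangular partition per column.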
IV. CONCLUSION
A new approach for the development of linguistically
interpretable fuzzy models from data, to forecast the behavior of
time-varying dynamic systems, was developed in this paper.
The proposed fuzzy identification algorithm uses triangular
membership functions with an overlap of 0.5 for the antecedent
partition, avoiding the complex overlapping that occurs in other
methods. This approach does not require other techniques
(neural networks, genetic algorithms, etc.) for the learning
process.
The results of Examples I and II indicate that the proposed
approach is an effective and promising forecasting tool. It can
capture the system's dynamic behavior quickly and track the
system's characteristics accurately without sacrificing the
interpretability of the fuzzy system.
REFERENCES
[1] L.-X. Wang and J. M. Mendel, "Generating fuzzy rules by learning from examples", IEEE Transactions on Systems, Man and Cybernetics, vol. 22, pp. 1414-1427, Nov. 1992.
[2] W. Yu, F. Ortiz-Rodriguez and M. Moreno-Armendariz, "Hierarchical Fuzzy CMAC for Nonlinear System Modeling", IEEE Trans. Fuzzy Systems, vol. 16, no. 5, pp. 1302-1314, Oct. 2008.
[3] M. Sugeno and T. Yasukawa, "A fuzzy logic based approach to qualitative modeling", IEEE Transactions on Fuzzy Systems, vol. 1, no. 1, pp. 7-31, 1993.
[4] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, 1987.
[5] E. E. Gustafson and W. C. Kessel, "Fuzzy Clustering with a Fuzzy Covariance Matrix", IEEE CDC, San Diego, California, pp. 503-516, 1979.
[6] D. Nauck and R. Kruse, "Nefclass - a neuro-fuzzy approach for the classification of data", in Proceedings of the Symposium on Applied Computing, 1995.
[7] D. Nauck and R. Kruse, "Neuro-fuzzy systems for function approximation", Fuzzy Sets and Systems, vol. 101, no. 2, pp. 261-271, Jan. 1999.
[8] R. P. Paiva and A. Dourado, "Interpretability and Learning in Neuro-Fuzzy Systems", Fuzzy Sets and Systems, vol. 147, pp. 17-38, 2004.
[9] J. Espinosa and J. Vandewalle, "Constructing Fuzzy Models with Linguistic Integrity from Numerical Data-Afreli Algorithm", IEEE Trans. Fuzzy Systems, vol. 8, no. 5, pp. 591-600, Oct. 2000.
[10] T. Sudkamp, A. Knapp and J. Knapp, "Model Generation by Domain Refinement and Rule Reduction", IEEE Trans. on Systems, Man and Cybernetics, vol. 33, no. 1, Feb. 2003.
[11] S. Marsili-Libelli, "Fuzzy Prediction of Algal Blooms in the Orbetello Lagoon", Environmental Modelling & Software, vol. 19, pp. 799-808, 2004.
[12] W. Wang and J. Vrbanek, "An Evolving Fuzzy Predictor for Industrial Applications", IEEE Trans. Fuzzy Systems, vol. 16, no. 6, pp. 1439-1449, 2008.
[13] W. Stach, L. Kurgan and W. Pedrycz, "Numerical and Linguistic Prediction of Time Series with the Use of Fuzzy Cognitive Maps", IEEE Trans. Fuzzy Systems, vol. 16, no. 1, pp. 61-72, 2008.
[14] X. Liu, B. K. Kwan and S. Y. Foo, "Time Series Prediction Based on Fuzzy Principles", Preprint, Department of Electrical and Computer Engineering, Florida State University, 2003.
[15] J. Contreras, R. Misa and L. Murillo, "Obtención de Modelos Borrosos Interpretables de Procesos Dinámicos", RIAI: Revista Iberoamericana de Automática e Informática Industrial, vol. 5, no. 3, pp. 70-77, Jul. 2008.
[16] J. Contreras, R. Misa and L. Murillo, "Interpretable Fuzzy Models from Data and Adaptive Fuzzy Control: A New Approach", IEEE International Conference on Fuzzy Systems, IEEE Computational Intelligence Society, pp. 1591-1596, Jul. 2007.
[17] Ch-F. Juang, "A TSK Type Recurrent Fuzzy Network for Dynamic Systems Processing by Neural Network and Genetic Algorithm", IEEE Trans. Fuzzy Systems, vol. 10, no. 2, pp. 155-170, Apr. 2002.
[18] Y. Chae, K. Oh, W. Lee and G. Kang, "Transformation of TSK fuzzy system into fuzzy system with singleton consequents and its application", IEEE International Conference on Fuzzy Systems, IEEE Computational Intelligence Society, vol. 2, pp. 969-973, 1999.
[19] W. Pedrycz, "Why Triangular Membership Functions?", Fuzzy Sets and Systems, vol. 64, pp. 21-30, 1994.
[20] A. H. Meghdadi and M.-R. Akbarzadeh-T., "Fuzzy Modeling of Nonlinear Stochastic System by Learning from Example", 9th IFSA World Congress and 20th NAFIPS International Conference, pp. 2746-2751, Jul. 2001.