Generating dynamic fuzzy models for prediction problems

Juan Contreras

2009, NAFIPS 2009 - 2009 Annual Meeting of the North American Fuzzy Information Processing Society

In this paper we present a new method to generate interpretable fuzzy systems from training data. A fuzzy system is developed for nonlinear systems modeling and for system state forecasting. The antecedent partition uses triangular sets with 0.5 overlap, avoiding the complex overlapping that occurs in other methods. Singleton consequents are employed and the least squares method is used to adjust them. This approach is not a hybrid system and does not employ other techniques, such as neural networks or genetic algorithms. Two benchmark problems are used to illustrate our approach: the first is an input-output NARMAX model, one of the most popular models in the neural and fuzzy literature; the second is the chaotic, nonperiodic and nonconvergent Mackey-Glass series, commonly used to evaluate time series forecasting schemes.

The 28th North American Fuzzy Information Processing Society Annual Conference (NAFIPS2009), Cincinnati, Ohio, USA, June 14-17, 2009

Juan Contreras, Department of Naval Engineering, Escuela Naval Almirante Padilla, Cartagena, Colombia, epcontrerasj@ieee.org
Oscar Acuña, Department of Electrical Engineering, Universidad Tecnológica de Bolívar, Cartagena, Colombia, oacuna@unitecnologica.edu.co

Keywords: fuzzy identification; interpretability; dynamic systems; least squares method

I. INTRODUCTION

One of the first proposals to automatically design a fuzzy system from data was the table look-up scheme [1]. However, when the number of inputs and membership functions is large, the number of fuzzy rules increases exponentially [2]. Sugeno and Yasukawa [3] proposed a methodology to identify fuzzy model parameters using singleton consequents, but it requires many rules and has a poor description capacity.
Different approaches have been proposed to generate fuzzy models from input-output data [3]-[10], but they typically seek good accuracy, and interpretability of the fuzzy model is not their first concern. In fuzzy systems, the resulting models need some transparency, i.e., their information must be interpretable, so as to permit a deeper understanding of the system under study [8]. Interpretability is defined by at least five criteria [8], [9]:

a. Distinguishability. The membership functions should be clearly different and each linguistic label should have semantic meaning.
b. Any element from the universe of discourse should belong to at least one of the fuzzy sets.
c. Because each linguistic label has semantic meaning, at least one of the values in the universe of discourse should have a membership degree equal to one. In other words, all the fuzzy sets should be normal.
d. The number of membership functions should not exceed the limit of 9 distinct terms.
e. The number of rules should be limited according to human cognitive issues.

Fig. 1 shows three membership functions with strong overlapping, which makes it very difficult to label them. This occurs frequently when a neural network or genetic algorithm is used during the training process.

Time series prediction is a problem with a wide range of applications, including energy systems planning, flood forecasting, traffic control, stock exchange operations and weather prediction. Accordingly, a number of different prediction approaches have been proposed [2], [11]-[14]. Basically, time series prediction can be considered a modeling problem. The first step is establishing a mapping between input(s) and output(s). Usually, the mapping is nonlinear and chaotic. Once such a mapping is set up, it is used to predict future values based on past and current observations [13].
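The mapping between past observations and a future value is usually built from lagged samples of the series. As a minimal sketch (the function name `make_lagged_pairs`, its parameters and the toy series are illustrative assumptions, not part of the original method), the input-output pairs could be assembled as follows:

```python
def make_lagged_pairs(series, lags, horizon):
    """Build (inputs, target) pairs from a scalar time series.

    For each admissible time t, the inputs are series[t - lag] for every
    lag in `lags`, and the target is series[t + horizon].
    """
    max_lag = max(lags)
    pairs = []
    for t in range(max_lag, len(series) - horizon):
        inputs = [series[t - lag] for lag in lags]
        pairs.append((inputs, series[t + horizon]))
    return pairs


# Toy series: predict the next value from the current and two previous ones.
toy_series = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
toy_pairs = make_lagged_pairs(toy_series, lags=(2, 1, 0), horizon=1)
```

Each pair holds the lagged inputs in the order given by `lags`, so the same helper covers both one-step and multi-step prediction by changing `horizon`.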
This paper presents a new approach for the development of linguistically interpretable fuzzy models from data. The approach has been used in system identification [15], [16], but here it is used as a fuzzy predictor. The methodology used to obtain the fuzzy model from input and output data is presented in two phases. In the first, the inference error method is presented, followed by the fuzzy identification algorithm that generates an interpretable fuzzy model. In the second, the method is applied to two well-known benchmarks: the first is an input-output NARMAX model, one of the most popular models in the neural and fuzzy literature [2], [17]; the second is the chaotic, nonperiodic and nonconvergent Mackey-Glass series, commonly used to evaluate time series forecasting schemes [12].

Fig. 1. Non-interpretable distribution of membership functions

II. FUZZY IDENTIFICATION APPROACH

A. Inference Error

A fuzzy rule "if u is A, then y is B", where u and y are two numeric variables and A ⊂ U and B ⊂ Y are the fuzzy input and output sets defined on the universes U and Y, is equivalent to the inequality

$$\mu_A(u) \le \mu_B(y) \tag{1}$$

The inference error ε is given by

$$\varepsilon = \begin{cases} 0, & \mu_A(u) \le \mu_B(y) \\ \mu_A(u) - \mu_B(y), & \mu_A(u) > \mu_B(y) \end{cases} \tag{2}$$

A fuzzy rule of the kind "If u is A, then y is B" with a null inference error must fulfill the condition

$$\mu_A(u) = \mu_B(y) \tag{3}$$

In this approach, the consequent $\mu_B(y)$ will be a singleton, because fuzzy models with singleton consequents can be easily understood and adjusted [18]. If the system has n inputs, it must be represented by rules of the kind "If u1 is A1 and u2 is A2 and … and un is An, then y is B", and the generated system must fulfill the condition

$$\mu_{A_1}(x_k) \wedge \mu_{A_2}(x_k) \wedge \dots \wedge \mu_{A_n}(x_k) = \mu_B(y_k) \tag{4}$$

where ∧ represents a t-norm, or an aggregation operator, of fuzzy logic.

B. Fuzzy Model Structure

1) Membership Functions
The universe partitioning of the input variables in the learning process is done with normalized triangular sets with a specific overlapping of 0.5. The triangular membership functions allow the reconstruction of the linguistic value at the same numeric value after a defuzzification method has been applied [19]. In addition, the overlapping at 0.5 ensures that the supports of the fuzzy sets are different. The fuzzy sets generated for the output variable are singletons.

2) Distribution of the Membership Functions
The triangular fuzzy sets of the input variables are distributed symmetrically over each respective universe.

3) Operators
For combining the antecedents, OWA operators are used [21].

4) Inference Method

$$f(x^{(i)}) = \frac{\sum_{j=1}^{L} y^{j}\, m_j(x^{(i)})}{\sum_{j=1}^{L} m_j(x^{(i)})} \tag{5}$$

where

$$m_j(x^{(i)}) = \mu_{A_1^j}(x_1^{(i)}) \cdot \mu_{A_2^j}(x_2^{(i)}) \cdots \mu_{A_n^j}(x_n^{(i)}) \tag{6}$$

is the output grade of the j-th rule of a Sugeno fuzzy system and $y^j$ is the singleton value corresponding to rule j. Because we use a fuzzy partition with normalized triangular sets with a specific overlapping of 0.5, $\sum_{j=1}^{L} m_j(x^{(i)})$ equals the number of input variables. Then, (5) can be expressed as

$$f(x^{(i)}) = \frac{\sum_{j=1}^{L} y^{j}\, m_j(x^{(i)})}{p} \tag{7}$$

where p is the number of input variables (written n in (4) and (6)).

C. Fuzzy Identification Algorithm

Given a collection of experimental input and output data $\{x_k, y_k\}$, k = 1, ..., N, where $x_k$ is the p-dimensional input array $x_k^1, x_k^2, \dots, x_k^p$ and $y_k$ is the one-dimensional output, the algorithm is defined by the following steps:

1. Organization of the set of N input-output data pairs $\{x_k^{(i)}, y^{(i)}\}$, with i = 1, ..., N and k = 1, ..., p, where $x^{(i)} \in \Re^p$ are input arrays and $y^{(i)}$ are output scalars. See Fig. 2.

Fig. 2. Data organization

2. Determination of the universe range of each variable, according to the maximum and minimum values of the associated data, $[x_k^-, x_k^+]$ and $[y^-, y^+]$.

3. Distribution of triangular membership functions over each universe. As a general condition, the vertex with membership value one (modal value) falls at the middle of the region covered by the membership function, while the other two vertices, with membership values equal to zero, fall at the middle of the two neighboring regions. See Fig. 3.

4. Calculate the position of the modal values from the input variable(s), according to

$$\text{if } \mu_{A_k^{(n)}}(x_k^{(i)}) = 1 \ \text{ then } \ ys_k^{(n)} = y^{(i)}$$

where $ys_k^{(n)}$ corresponds to the projection over the output space of the evaluation of datum $x^{(i)}$ of the k-th input variable at the n-th set of the corresponding partition. The output value corresponding to this projection is given by the value at the i-th position of the output array y. See Fig. 4.

Fig. 4. Generating consequents

5. Rule determination. Initially, the number of rules equals the number of sets of each input variable multiplied by the number of variables; in other words, n × k. The membership function associated with a consequent is the antecedent of that rule. Antecedents of rules with the same consequent are merged using an OWA operator, thus reducing the number of rules, as seen in Fig. 5.

6. Model validation using the inference method described by (7).

7. Parameter adjustment, relocating the output singletons using the least squares method.
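Steps 3 and 7 can be sketched in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: each rule is taken to have a single antecedent (one triangular set of one input), so that the activations of a sample sum to p as stated for (7); the singletons are adjusted with a standard recursive least squares update, one sample at a time, started from a zero singleton vector and a scaled identity covariance, both of which are assumptions. All function names are hypothetical.

```python
def triangular_memberships(x, lo, hi, n_sets):
    """Degrees of x in n_sets triangular sets evenly spaced on [lo, hi].

    Neighbouring sets cross at 0.5, so the degrees sum to 1 inside [lo, hi].
    """
    width = (hi - lo) / (n_sets - 1)      # distance between modal values
    x = min(max(x, lo), hi)               # clamp to the universe
    return [max(0.0, 1.0 - abs(x - (lo + n * width)) / width)
            for n in range(n_sets)]


def firing_strengths(sample, ranges, n_sets):
    """Normalised activations w_j = m_j / p, one rule per (input, set) pair."""
    p = len(sample)
    w = []
    for x, (lo, hi) in zip(sample, ranges):
        w.extend(m / p for m in triangular_memberships(x, lo, hi, n_sets))
    return w


def infer(sample, theta, ranges, n_sets):
    """Model output: activations weighted by the singleton consequents."""
    w = firing_strengths(sample, ranges, n_sets)
    return sum(t * wi for t, wi in zip(theta, w))


def rls_fit(samples, targets, ranges, n_sets, p0=1e6):
    """One recursive least squares pass; returns the singleton vector.

    Assumed initial values: theta(0) = 0 and P(0) = p0 * identity.
    """
    n_rules = len(ranges) * n_sets
    theta = [0.0] * n_rules
    P = [[p0 if i == j else 0.0 for j in range(n_rules)]
         for i in range(n_rules)]
    for sample, y in zip(samples, targets):
        w = firing_strengths(sample, ranges, n_sets)
        Pw = [sum(P[i][j] * w[j] for j in range(n_rules))
              for i in range(n_rules)]
        denom = 1.0 + sum(w[i] * Pw[i] for i in range(n_rules))
        gain = [v / denom for v in Pw]
        err = y - sum(t * wi for t, wi in zip(theta, w))
        theta = [t + g * err for t, g in zip(theta, gain)]
        # P <- P - gain * (w P); P is symmetric, so the row vector w P equals Pw
        P = [[P[i][j] - gain[i] * Pw[j] for j in range(n_rules)]
             for i in range(n_rules)]
    return theta


# Toy check: one input on [0, 1], three sets, target y = 2x + 1.
samples_demo = [[i / 20] for i in range(21)]
targets_demo = [2.0 * s[0] + 1.0 for s in samples_demo]
ranges_demo = [(0.0, 1.0)]
theta_demo = rls_fit(samples_demo, targets_demo, ranges_demo, n_sets=3)
```

In this toy case the target is linear, so a triangular partition can reproduce it exactly and the fitted singletons end up close to the target values at the three modal points. With several inputs the activation columns become linearly dependent, which is one reason a recursive or regularised solver is a safer choice than a plain matrix inversion.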
Equation (7) can be expressed in the form:

$$f(x^{(i)}) = \sum_{j=1}^{L} y^{j}\, w_j(x^{(i)}) \tag{8}$$

where

$$w_j(x^{(i)}) = \frac{m_j(x^{(i)})}{p} = w_j^{i} \tag{9}$$

Output values can be represented as Y = Wθ + E, which in matrix form is given by

$$\underbrace{\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{bmatrix}}_{Y} = \underbrace{\begin{bmatrix} w_1^1 & w_2^1 & \cdots & w_L^1 \\ w_1^2 & w_2^2 & \cdots & w_L^2 \\ \vdots & \vdots & & \vdots \\ w_1^n & w_2^n & \cdots & w_L^n \end{bmatrix}}_{W} \underbrace{\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^L \end{bmatrix}}_{\theta} + \underbrace{\begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}}_{E} \tag{10}$$

where Y collects the measured outputs, θ collects the singleton consequents, and E is the approximation error, which should be minimized. Using the quadratic error norm, we have:

$$\theta = (W^T W)^{-1} W^T Y \tag{11}$$

The solution is valid if and only if

$$\operatorname{rank}(W^T W) = \dim(\theta) \tag{12}$$

so that the inverse in (11) exists. Equation (12) implies that all rules have to receive enough excitation during training. In practice this is not always guaranteed, so applying this method can produce catastrophic results for rules with low excitation and a significant bias in rules with sufficient excitation. This problem can be solved using recursive least squares (RLS). The RLS algorithm is [9]:

$$\theta(k+1) = \theta(k) + \gamma(k)\,[\,y(k+1) - W_{k+1}\,\theta(k)\,] \tag{13}$$

where

$$W_k = \{ w_1^k, w_2^k, \dots, w_L^k \} \tag{14}$$

$$\gamma(k) = \frac{P(k)\, W_{k+1}^T}{W_{k+1}\, P(k)\, W_{k+1}^T + 1} \tag{15}$$

$$P(k+1) = [\,I - \gamma(k)\, W_{k+1}\,]\, P(k) \tag{16}$$

with the initial value P(0) obtained in step 4 of this algorithm.

8. Finish if either the mean square error (MSE) is not greater than a previously established value or the number of membership functions is more than 9. Otherwise, increment by 1 the number of sets of the input variable (the number of partition members) and return to step 3.

Fig. 3. Triangular sum-1 partition

Fig. 5. Generating rules

III. RESULTS

A. Example I

The nonlinear plant to be identified is governed by the following difference equation:

$$y(k+1) = \frac{y(k)\, y(k-1)\, y(k-2)\, u(k-1)\,[\,y(k-2) - 1\,] + u(k)}{1 + y(k-1)^2 + y(k-2)^2} \tag{17}$$

The output y(k + 1) depends on two previous inputs [u(k), u(k − 1)] and on three previous outputs [y(k), y(k − 1), y(k − 2)]. The following input is used for testing:

$$u(k) = \begin{cases} \sin(\pi k / 25), & k < 250 \\ 1.0, & 250 \le k < 500 \\ -1.0, & 500 \le k < 750 \\ 0.3 \sin(\pi k / 25) + 0.1 \sin(\pi k / 32) + 0.6 \sin(\pi k / 10), & 750 \le k < 1000 \end{cases}$$

The five variables u(k), u(k − 1), y(k), y(k − 1) and y(k − 2) are fed as inputs. Fig. 6 shows the root mean square error (RMSE) achieved vs the number of membership functions. With 7 membership functions an RMSE of 0.0247 is achieved, lower than the RMSE of 0.0265 achieved in [17].

Fig. 6. The root mean square error (RMSE) vs the number of triangular membership functions for Example I

Because of the page limit of this paper, only the structure of a fuzzy model with 3 membership functions is shown in Fig. 7. It is not difficult to assign linguistic terms to each of the membership functions. The labels S, M and B denote, respectively, the linguistic terms "small", "medium" and "big". The RMSE achieved in this case is 0.0398. With 5 input variables and 3 triangular membership functions for each input variable, 15 consequents (and 15 rules) are generated (step 5 of the fuzzy identification algorithm). The output of the obtained fuzzy model and the testing results are shown in Fig. 8.

Fig. 7. Partitions of u(k − 1), u(k), y(k − 2), y(k − 1) and y(k) with 3 triangular membership functions

The array of singleton consequents is:

$$\theta(k) = \begin{bmatrix} -0.5950 & 0.0005 & 0.5318 & 0.0677 & -0.0297 & -0.1008 & -0.5099 & 0.0481 & 0.3990 & 0.0205 & -0.0150 & -0.0683 & 0.0415 & 0.0492 & -0.1535 \end{bmatrix}^T$$

Then, the output of the fuzzy model can be calculated using (10), thus

$$\underbrace{\begin{bmatrix} y^1 \\ y^2 \\ \vdots \\ y^n \end{bmatrix}}_{Y} = \underbrace{\begin{bmatrix} \mu_S^1(x(k)) & \mu_M^1(x(k)) & \cdots & \mu_B^1(y(k-2)) \\ \mu_S^2(x(k)) & \mu_M^2(x(k)) & \cdots & \mu_B^2(y(k-2)) \\ \vdots & \vdots & & \vdots \\ \mu_S^n(x(k)) & \mu_M^n(x(k)) & \cdots & \mu_B^n(y(k-2)) \end{bmatrix}}_{W} \underbrace{\begin{bmatrix} -0.595 \\ 0.0005 \\ \vdots \\ -0.1535 \end{bmatrix}}_{\theta}$$

Fig. 8. Output of the nonlinear plant (solid curve) and fuzzy model (dotted curve) in Example I

B. Example II

Now we apply our approach to one of the most commonly used benchmarks in system identification: the prediction of the Mackey-Glass time series described by

$$\dot{x}(t) = \frac{0.2\, x(t - \tau)}{1 + x^{10}(t - \tau)} - 0.1\, x(t) \tag{18}$$

A more challenging dataset is used, with initial conditions x(0) = 1.2, x(t) = 0 when t < 0, dt = 1 and τ = 30 (instead of the generally used τ = 17) [12], [20]. In this case x(t − 30), x(t − 20), x(t − 10) and x(t) are used to predict x(t + 10). Fig. 9 shows the root mean square error (RMSE) achieved vs the number of membership functions. A comparison between the output of the obtained fuzzy model and the testing results is shown in Fig. 10.

Fig. 9. The root mean square error (RMSE) vs the number of triangular membership functions for Example II

Fig. 10. Output of the nonlinear plant (solid curve) and fuzzy model (dotted curve) in Example II

IV. CONCLUSION

A new approach for the development of linguistically interpretable fuzzy models from data, aimed at forecasting the behavior of time-varying dynamic systems, was developed in this paper. The proposed fuzzy identification algorithm uses triangular membership functions with 0.5 overlap for the antecedent partition, avoiding the complex overlapping that occurs in other methods. This approach does not require other techniques (neural networks, genetic algorithms, etc.) for the learning process.
Results shown in Examples I and II reveal that the proposed approach is an effective and promising forecasting tool. It can capture the system's dynamic behavior quickly and track the system's characteristics accurately without sacrificing the interpretability of the fuzzy system.

REFERENCES

[1] L.-X. Wang, J. M. Mendel, "Generating fuzzy rules by learning from examples", IEEE Transactions on Systems, Man and Cybernetics, vol. 22, pp. 1414-1427, Nov. 1992.
[2] W. Yu, F. Ortiz-Rodriguez and M. Moreno-Armendariz, "Hierarchical Fuzzy CMAC for Nonlinear System Modeling", IEEE Trans. Fuzzy Systems, vol. 16, no. 5, pp. 1302-1314, Oct. 2008.
[3] M. Sugeno, T. Yasukawa, "A fuzzy logic based approach to qualitative modeling", Transactions on Fuzzy Systems, vol. 1, no. 1, pp. 7-31, 1993.
[4] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, 1987.
[5] D. E. Gustafson, W. C. Kessel, "Fuzzy Clustering with a Fuzzy Covariance Matrix", IEEE CDC, San Diego, California, pp. 503-516, 1979.
[6] D. Nauck, R. Kruse, "NEFCLASS - a neuro-fuzzy approach for the classification of data", in Proceedings of the Symposium on Applied Computing, 1995.
[7] D. Nauck, R. Kruse, "Neuro-fuzzy systems for function approximation", Fuzzy Sets and Systems, vol. 101, no. 2, pp. 261-271, Jan. 1999.
[8] R. P. Paiva, A. Dourado, "Interpretability and Learning in Neuro-Fuzzy Systems", Fuzzy Sets and Systems, vol. 147, pp. 17-38, 2004.
[9] J. Espinosa, J. Vandewalle, "Constructing Fuzzy Models with Linguistic Integrity from Numerical Data-AFRELI Algorithm", IEEE Trans. Fuzzy Systems, vol. 8, no. 5, pp. 591-600, Oct. 2000.
[10] T. Sudkamp, A. Knapp, J. Knapp, "Model Generation by Domain Refinement and Rule Reduction", IEEE Trans. on Systems, Man and Cybernetics, vol. 33, no. 1, Feb. 2003.
[11] S. Marsili-Libelli, "Fuzzy Prediction of Algal Blooms in the Orbetello Lagoon", Environmental Modelling & Software, vol. 19, pp. 799-808, 2004.
[12] W. Wang and J. Vrbanek, "An Evolving Fuzzy Predictor for Industrial Applications", IEEE Trans. Fuzzy Systems, vol. 16, no. 6, pp. 1439-1449, 2008.
[13] W. Stach, L. Kurgan and W. Pedrycz, "Numerical and Linguistic Prediction of Time Series with the Use of Fuzzy Cognitive Maps", IEEE Trans. Fuzzy Systems, vol. 16, no. 1, pp. 61-72, 2008.
[14] X. Liu, B. K. Kwan and S. Y. Foo, "Time Series Prediction Based on Fuzzy Principles", Preprint, Department of Electrical and Computer Engineering, Florida State University, 2003.
[15] J. Contreras, R. Misa, L. Murillo, "Obtención de Modelos Borrosos Interpretables de Procesos Dinámicos", RIAI: Revista Iberoamericana de Automática e Informática Industrial, vol. 5, no. 3, pp. 70-77, Jul. 2008.
[16] J. Contreras, R. Misa, L. Murillo, "Interpretable Fuzzy Models from Data and Adaptive Fuzzy Control: A New Approach", IEEE International Conference on Fuzzy Systems, IEEE Computational Intelligence Society, pp. 1591-1596, Jul. 2007.
[17] Ch.-F. Juang, "A TSK Type Recurrent Fuzzy Network for Dynamic Systems Processing by Neural Network and Genetic Algorithm", IEEE Trans. Fuzzy Systems, vol. 10, no. 2, pp. 155-170, Apr. 2002.
[18] Y. Chae, K. Oh, W. Lee and G. Kang, "Transformation of TSK fuzzy system into fuzzy system with singleton consequents and its application", IEEE International Conference on Fuzzy Systems, IEEE Computational Intelligence Society, vol. 2, pp. 969-973, 1999.
[19] W. Pedrycz, "Why Triangular Membership Functions?", Fuzzy Sets and Systems, vol. 64, pp. 21-30, 1994.
[20] A. H. Meghdadi, M.-R. Akbarzadeh-T., "Fuzzy Modeling of Nonlinear Stochastic System by Learning from Example", 9th IFSA World Congress and 20th NAFIPS International Conference, pp. 2746-2751, Jul. 2001.
