Optimizing the Fuzzy C-Means Clustering Algorithm
Abstract: The Fuzzy C-Means (FCM) clustering algorithm is a method that is frequently used in pattern recognition. It has the advantage of giving good modeling results in many cases, although it is not capable of specifying the number of clusters by itself. In the FCM algorithm, most researchers fix the weighting exponent (m) to a conventional value of 2, which might not be appropriate for all applications. Consequently, the main objective of this paper is to use the subtractive clustering algorithm to provide the optimal number of clusters needed by the FCM algorithm, by optimizing the parameters of the subtractive clustering algorithm with an iterative search approach, and then to find an optimal weighting exponent (m) for the FCM algorithm. To obtain an optimal number of clusters, the iterative search approach is used to find the optimal single-output Sugeno-type Fuzzy Inference System (FIS) model, by optimizing the parameters of the subtractive clustering algorithm that give the minimum least square error between the actual data and the Sugeno fuzzy model. Once the number of clusters is optimized, two approaches are proposed to optimize the weighting exponent (m) in the FCM algorithm, namely an iterative search approach and genetic algorithms. The above approach is tested on data generated from the original functions, and optimal fuzzy models are obtained with minimum error between the real data and the obtained fuzzy models.
Keywords: Fuzzy c-means clustering, fuzzy inference system, genetic algorithm, model optimization, Sugeno system
Corresponding Author: Mohanad Alata, Mechanical Engineering Department, King Saud University, Saudi Arabia
points, Beightler et al. (1979). Luus and Jaakola (1973) identify three main types of search methods: calculus-based, enumerative and random. The work of Hesam and Ajith (2011) presents a hybrid fuzzy clustering method based on FCM and fuzzy PSO (FPSO), which makes use of the merits of both algorithms.

Hall et al. (1999) describe a genetically guided approach for optimizing the hard (J_1) and fuzzy (J_m) c-means functionals used in cluster analysis. Their experiments show that a genetic algorithm ameliorates the difficulty of choosing an initialization for the c-means clustering algorithms. The experiments use six data sets, including the Iris data, magnetic resonance and color images. The genetic algorithm approach is generally able to find the lowest known J_m value, or a J_m associated with a partition very similar to that associated with the lowest J_m value. On data sets with several local extrema, the GA approach always avoids the less desirable solutions. Degenerate partitions are always avoided by the GA approach, which provides an effective method for optimizing clustering models whose objective function can be represented in terms of cluster centers. The time cost of genetically guided clustering is shown to make a series of random initializations of fuzzy/hard c-means, where the partition associated with the lowest J_m value is chosen, an effective competitor for many clustering domains.

The main differences between this work and the one by Hall et al. (1999) are:
• This study used the least square error as the objective function for the genetic algorithm, whereas Hall et al. (1999) used J_m as the objective function.
• This study optimized the weighting exponent m without changing the distance function, whereas Hall et al. (1999) kept the weighting exponent at m = 2.00 and used two different distance functions.
The subtractive clustering algorithm: The subtractive clustering method is an extension of the mountain clustering method proposed by Yager and Filev (1994). The algorithm:

• Selects the data point with the highest potential to be the first cluster center.
• Removes all data points in the vicinity of the first cluster center (as determined by radii), in order to determine the next data cluster and its center location.
• Iterates on this process until all of the data is within radii of a cluster center.

The subtractive clustering is used to determine the number of clusters of the given data and then to generate a fuzzy model. The iterative search is then used to optimize the least square error between the generated model and the test model. After that, the number of clusters is handed to the fuzzy c-means algorithm.
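As an illustration only (not the authors' code), the steps above can be sketched in Python using the widely used potential formulas of subtractive clustering (due to Chiu). The four parameters mirror the radius, squash factor, accept ratio and reject ratio that this paper optimizes; the grey-zone acceptance test of the full method is simplified here to a stopping rule:

```python
import numpy as np

def subtractive_clustering(X, radius=0.5, squash=1.25, accept=0.5, reject=0.15):
    # Normalize each dimension to [0, 1] so a single radius fits all axes.
    X = np.asarray(X, dtype=float)
    Xn = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (squash * radius) ** 2
    # Potential of each point = density of the data in its neighbourhood.
    d2 = ((Xn[:, None, :] - Xn[None, :, :]) ** 2).sum(axis=2)
    P = np.exp(-alpha * d2).sum(axis=1)
    p_first = P.max()
    centers = []
    for _ in range(len(X)):
        k = int(np.argmax(P))
        if P[k] < reject * p_first:
            break                    # remaining potential too weak: stop
        if P[k] < accept * p_first and centers:
            break                    # grey zone: simplified rule, stop here
        centers.append(X[k])         # accept the point as a cluster center
        # Subtract the new center's influence so that points near it
        # do not themselves become centers.
        P = P - P[k] * np.exp(-beta * d2[k])
    return np.array(centers)
```

The number of rows returned is the cluster count that is later passed to FCM; shrinking the radius yields more, tighter clusters.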
The fuzzy c-means clustering algorithm: Fuzzy C-Means (FCM) is a method of clustering which allows one piece of data to belong to two or more clusters. This method is frequently used in pattern recognition. It is based on minimization of the following objective function:

J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \| x_i - c_j \|^2 \qquad (1)

where,
m : Any real number greater than 1; it was set to 2.00 by Bezdek (1981)
u_ij : The degree of membership of x_i in the cluster j
x_i : The ith of d-dimensional measured data
c_j : The d-dimension center of the cluster
||*|| : Any norm expressing the similarity between any measured data and the center

Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above, with the update of the membership u_ij and the cluster centers c_j by:

u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \frac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{2/(m-1)}} \qquad (2)

c_j = \frac{\sum_{i=1}^{N} u_{ij}^{m} \, x_i}{\sum_{i=1}^{N} u_{ij}^{m}} \qquad (3)

This iteration will stop when:

\max_{ij} \left\{ \left| u_{ij}^{(k+1)} - u_{ij}^{(k)} \right| \right\} < \varepsilon \qquad (4)

where,
ε : A termination criterion between 0 and 1
k : The iteration steps

This procedure converges to a local minimum or a saddle point of J_m.
The algorithm is composed of the following steps (a code sketch follows the list):

• Initialize the membership matrix U = [u_ij], U(0)
• At the k-th step: calculate the center vectors C(k) = [c_j] with U(k), using Eq. (3)
• Update U(k) to U(k+1), using Eq. (2)
• If ||U(k+1) − U(k)|| < ε then STOP; otherwise return to the second step
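For concreteness, here is a minimal NumPy sketch of these steps. This is an illustration, not the authors' implementation; the function and variable names are chosen for readability:

```python
import numpy as np

def fcm(X, n_clusters, m=2.0, eps=1e-5, max_iter=300, seed=0):
    # X is an (N, d) array; returns cluster centers C and memberships U.
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: initialize U(0) with each row summing to 1 over the C clusters.
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    C = None
    for _ in range(max_iter):
        Um = U ** m
        # Step 2, Eq. (3): centers as membership-weighted means of the data.
        C = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Step 3, Eq. (2): memberships from relative distances to the centers.
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                   # guard against division by zero
        ratio = d[:, :, None] / d[:, None, :]   # ||xi - cj|| / ||xi - ck||
        U_new = 1.0 / (ratio ** (2.0 / (m - 1.0))).sum(axis=2)
        # Step 4, Eq. (4): stop once memberships change by less than eps.
        converged = np.max(np.abs(U_new - U)) < eps
        U = U_new
        if converged:
            break
    return C, U
```

Note how the weighting exponent m enters both update equations; this is the single scalar that the rest of the paper tunes instead of fixing m = 2.00.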
The genetics algorithm: The GA is a stochastic global search method that mimics the metaphor of natural biological evolution (Ginat, 1988; Wang, 1997; Sinha et al., 2010). GAs operate on a population of potential solutions, applying the principle of survival of the fittest to produce (hopefully) better and better approximations to a solution. At each generation, a new set of approximations is created by selecting individuals according to their level of fitness in the problem domain and breeding them together using operators borrowed from natural genetics.
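A toy real-coded GA of the kind described above might look as follows. This is a sketch under my own design choices (tournament selection, blend crossover, Gaussian mutation, no elitism); `model_error` in the usage comment is a hypothetical fitness that would return a fuzzy model's least square error, as this paper uses:

```python
import numpy as np

def ga_minimize(fitness, bounds, pop_size=20, generations=40,
                pc=0.8, pm=0.1, seed=0):
    # Tiny real-coded GA minimizing `fitness` over a single parameter.
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, pop_size)
    for _ in range(generations):
        f = np.array([fitness(x) for x in pop])
        # Tournament selection: the fitter of two random individuals survives.
        i, j = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where(f[i] < f[j], pop[i], pop[j])
        # Blend crossover between shuffled pairs of parents.
        mates = rng.permutation(parents)
        w = rng.random(pop_size)
        cross = rng.random(pop_size) < pc
        children = np.where(cross, w * parents + (1 - w) * mates, parents)
        # Gaussian mutation keeps the search exploring the interval.
        mut = rng.random(pop_size) < pm
        children = children + mut * rng.normal(0.0, 0.1 * (hi - lo), pop_size)
        pop = np.clip(children, lo, hi)
    f = np.array([fitness(x) for x in pop])
    return pop[np.argmin(f)], float(f.min())

# Usage sketch (hypothetical): search the weighting exponent m in (1, 3],
# scoring each candidate by the least square error of the resulting model.
# best_m, best_err = ga_minimize(lambda m: model_error(m), bounds=(1.05, 3.0))
```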
[Fig. 1: Flow chart of the program. Recoverable labels: original data and testing data feed a subtractive clustering model and an FCM clustering model; each model's output is compared with the original function results; the least squares error drives changes to the subtractive clustering parameters; the resulting cluster centers are stored.]

Fig. 2: Random data points of Eq. (7); blue circles for the data to be clustered and red stars for the testing data
The program builds the fuzzy model using subtractive clustering and optimizes the least square error between the output of the fuzzy model and the output from the original function by entering a tested data set. The optimizing is carried out by iteration. After that, the genetic algorithm optimizes the weighting exponent of FCM. In the same way, the program builds the fuzzy model using FCM and then optimizes the weighting exponent m by minimizing the least square error between the output of the fuzzy model and the output from the original function by entering the same tested data. Figure 1 shows the flow chart of the program.
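Assembling the pieces, the flow of Fig. 1 can be sketched as below, reusing the illustrative `subtractive_clustering`, `fcm` and `ga_minimize` helpers from the earlier snippets. `least_square_error` is a hypothetical stand-in for building the Sugeno FIS from the centers and evaluating it against the test data, a step the paper performs but does not list as code:

```python
import numpy as np

def pipeline(X, y, X_test, y_test):
    # Step 1: iterative search over the subtractive-clustering radius
    # (step 0.001, as in the paper's "for loop"), keeping the setting
    # whose model gives the least square error on the test data.
    best_err, best_centers = np.inf, None
    for radius in np.arange(0.05, 1.0, 0.001):
        centers = subtractive_clustering(np.c_[X, y], radius=radius)
        err = least_square_error(centers, X_test, y_test)  # hypothetical helper
        if len(centers) and err < best_err:
            best_err, best_centers = err, centers
    n_clusters = len(best_centers)

    # Step 2: hand the optimal cluster count to FCM and let the GA search
    # the weighting exponent m instead of fixing m = 2.00.
    def fcm_fitness(m):
        centers, _ = fcm(np.c_[X, y], n_clusters, m=m)
        return least_square_error(centers, X_test, y_test)

    best_m, best_fcm_err = ga_minimize(fcm_fitness, bounds=(1.05, 3.0))
    return n_clusters, best_m, best_fcm_err
```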
The best way to introduce the results is by presenting four examples of modeling highly nonlinear functions. Each example is discussed and plotted, then compared with the best error of the original FCM with weighting exponent (m = 2.00).

Example 1-modeling a two-input nonlinear function: In this example, the following nonlinear function was proposed:

z = \frac{\sin(x)}{x} \cdot \frac{\sin(y)}{y} \qquad (7)

The ranges X ∈ [-10.5, 10.5] and Y ∈ [-10.5, 10.5] form the input space of the above equation; 200 data pairs were obtained randomly (Fig. 2). First, the best least square error was obtained for the FCM with weighting exponent (m = 2.00), which is (0.0126 with 53 clusters). Next, the optimized least square error of the subtractive clustering is obtained by iteration, which is (0.0115 with 52 clusters); the error improves by 10%. Then, the cluster number is taken to the FCM algorithm and the error is optimized to (0.004 with 52 clusters), which means the error improves by 310%, with a weighting exponent (m) of (1.4149). The results are shown in Table 1.
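For instance, the 200 random data pairs of Eq. (7) could be generated along these lines (a sketch; the paper does not specify its sampling code):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-10.5, 10.5, 200)     # X range of Eq. (7)
y = rng.uniform(-10.5, 10.5, 200)     # Y range of Eq. (7)
z = np.sin(x) / x * np.sin(y) / y     # Eq. (7); a continuous uniform draw
                                      # is never exactly 0, so no 0/0 case
data = np.column_stack([x, y, z])     # 200 samples to be clustered
```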
Example 2-modeling a one-input nonlinear function: In this example, a nonlinear function with a single variable x was proposed:

y = \frac{\sin(x)}{x} \qquad (8)

The range X ∈ [-20.5, 20.5] is the input space of the above equation; 200 data pairs were obtained randomly and are shown in Fig. 3. First, the best least square error is obtained for the FCM with weighting exponent (m = 2.00), which is (5.1898e-7 with 178 clusters). Next, the least square error of the subtractive clustering is obtained by iteration, which was (1e-10 with 24 clusters), since this
error is the pre-defined stopping threshold: the iteration stops once the error is less than (1e-10). Then, the cluster number is taken to the FCM algorithm; the error is (1.2775e-12 with 24 clusters) and the weighting exponent (m) is (1.7075). The error improves by (4e7)% and the number of clusters improves by 741%. The results are shown in Table 2.
Table 4: The final least square errors and cluster numbers for the original FCM and for the FCM whose number of clusters was obtained from the iteratively or genetically optimized subtractive clustering

                 The original FCM (m = 2)     Iteration then genetics
The function     Error          Clusters      Error           Clusters    New (m)
Equation (7)     0.0126         53            0.004           52          1.4149
Equation (8)     5.1898e-7      178           1.2775e-12      24          1.7075
Equation (9)     3.3583e-17     188           2.2819e-18      103         100.8656
Example 3-modeling a one-input nonlinear function: In this example, the following nonlinear function was proposed:

y = (x - 3)^3 - 10 \qquad (9)

The range X ∈ [1, 50] is the input space of the above equation; 200 data pairs were obtained randomly and are shown in Fig. 4. First, the best least square error is obtained for the FCM with weighting exponent (m = 2.00), which is (3.3583e-17 with 188 clusters). Next, the least square error of the subtractive clustering is obtained by iteration, which is (1.6988e-17 with 103 clusters), since the least error can be taken from the iteration. Then, the cluster number is taken to the FCM algorithm; the error was (2.2819e-18 with 103 clusters) and the weighting exponent (m) is 100.8656. Here the number of clusters is reduced from 188 to 103, which means the number of rules is reduced, and the error is improved by 14 times. The results are shown in Table 3.

The whole results are shown in Table 4.

CONCLUSION

In this study, the subtractive clustering parameters, which are the radius, the squash factor, the accept ratio and the reject ratio, are optimized using the GA.

The original FCM proposed by Bezdek (1981) is optimized using the GA, and values of the weighting exponent other than (m = 2) give less approximation error. Therefore, the least square error is enhanced in most of the cases handled in this work. Also, the number of clusters is reduced.

The time needed to reach an optimum through the GA is less than the time needed by the iterative approach. The GA also provides higher resolution compared to the iterative search, because the precision of the iterative search depends on the step value of the "for loop", which is at best 0.001 for the radius parameter in the subtractive clustering algorithm, whereas for the GA it depends on the length of the individual and the range of the parameter, which is 0.00003 for the same radius parameter. So the GA gives better performance and less approximation error in less time.

It can also be concluded that the time needed for the GA to optimize an objective function depends on the number and the length of the individuals in the population and on the number of parameters to be optimized.

ACKNOWLEDGMENT

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding the work through the research group project No. RGP-VPP-036.

REFERENCES

Amiya, H. and P. Nilavra, 2011. An evolutionary dynamic clustering based colour image segmentation. Int. J. Image Process., 4(6): 549-556.
Beightler, C.S., D.J. Phillips and D.J. Wild, 1979. Foundations of Optimization. 2nd Edn., Prentice-Hall, Englewood, pp: 487, ISBN: 0133303322.
Bezdek, J.C., 1981. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York, pp: 256.
Dipak, K. and H. Amiya, 2010. An efficient dynamic image segmentation algorithm using dynamic GA based clustering. Int. J. Logist. Supp. Chain Manage., 2(1): 17-20.
Dunn, J.C., 1973. A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters. J. Cybernet., 3(3): 32-57.
Ginat, D., 1988. Genetic algorithm - a function optimizer. NSF Report, Department of Computer Science, University of Maryland, College Park, MD 20740.
Hall, L.O., I.B. Ozyurt and J.C. Bezdek, 1999. Clustering with a genetically optimized approach. IEEE T. Evol. Comput., 3(2): 103-112.
Hesam, I. and A. Ajith, 2011. Fuzzy c-means and fuzzy swarm for fuzzy clustering problem. Exp. Syst. Appl., 38: 1835-1838.
Li-Xin, W., 1997. A Course in Fuzzy Systems and Control. Prentice Hall, Upper Saddle River, pp: 424, ISBN: 0135408822.
Luus, R. and T.H.I. Jaakola, 1973. Optimization by direct search and systematic reduction of the size of search region. Am. Inst. Chem. Eng. J., 19(4): 760-766.
Sinha, S.K., R.N. Patel and R. Prasad, 2010. Application of GA and PSO tuned fuzzy controller for AGC of three area thermal-thermal-hydro power system. Int. J. Comput. Theory Eng., 2(2): 1793-8201.
Wang, Q.J., 1997. Using genetic algorithms to optimize model parameters. J. Env. Mod. Software, 12(1).
Yager, R. and D. Filev, 1994. Generation of fuzzy rules by mountain clustering. J. Intell. Fuzzy Syst., 2(3): 209-219.
Zhang, Y., W. Wang, X. Zhang and Y. Li, 2008. A cluster validity index for fuzzy clustering. Inform. Sci., 178(4): 1205-1218.
Zhang, K.S., B. Li, J. Xu and L. Wu, 2009. New modification of fuzzy c-means clustering algorithm. Fuzzy Inform. Eng., 54: 448-455.