https://doi.org/10.1016/j.cogsys.2019.06.003
Cognitive Systems Research 58 (2019) 173–194
www.elsevier.com/locate/cogsys
Received 31 December 2018; received in revised form 2 May 2019; accepted 13 June 2019
Available online 21 June 2019
E-mail addresses: kbhattacharjee@as.iitr.ac.in (K. Bhattacharjee), millifpt@iitr.ac.in (M. Pant).
Abstract
The Multi-Layer Perceptron (MLP) is among the most widely applied Artificial Neural Networks (ANNs), but it requires application-specific design and training. This paper deals with the high-dimensional problem of classifying human glioma from the Molecular Human Brain Neoplasia Data by designing an MLP that is trained by hybridizing Particle Swarm Optimization (PSO) and Genetic Algorithm (GA). The results are compared with those of the individual algorithms in terms of convergence rate, Mean Squared Error (MSE) and classification accuracy.
© 2019 Elsevier B.V. All rights reserved.
1. Introduction

Lack of recorded molecular data for a large sample and inadequate biomedical data integration from various sources hamper the process of finding better treatments for brain tumors. Hence, the task of data integration, redistribution and analysis, both across and within functional domains, is a very important aspect of biomedical research. Each tumor has a special genomic signature which varies from patient to patient as well as from tumor to tumor. Hence, the development of individualized treatment based on these signatures needs an advanced biomedical informatics infrastructure. One such effort is "The Repository of Molecular Brain Neoplasia Data (REMBRANDT)" (Clark et al., 2013; Scarpace et al., 2015), which stores the data regarding the types and grades of human glioma. It links radiological phenotype to tissue genotype through clinical information and genomic characterization data. The primary objective of this paper is to design a Multi-Layer Perceptron (MLP) that classifies human glioma from REMBRANDT.

The aim of training an MLP is to obtain the optimal weight-bias combination that attains minimum training and testing error. In Machine Learning, the MLP is generally trained through a gradient-based supervised learning technique, Backpropagation (BP). There are, however, some disadvantages in training an MLP through BP, such as slow convergence (Fahlman, 1988; Vogl, Mangis, Rigler, Zink, & Alkon, 1988) and the tendency to become trapped in local minima (Gori & Tesi, 1992; Lee, Oh, & Kim, 1993). Also, the convergence of BP greatly depends on the initial parameter values; unsuitable initial values sometimes lead to divergence instead of convergence. It has also been pointed out that BP is more suitable for simple datasets but
its suitability deteriorates with increasing search space complexity (Mirjalili, Mirjalili, & Lewis, 2014). In the problem of classifying human glioma from the Molecular Brain Neoplasia Data, the search space is high-dimensional and consequently the computation becomes complex. These shortcomings make BP less attractive for practical applications concerning complex data sets.

An alternative to gradient-based learning algorithms is metaheuristic optimization, which has been a focus of attention for researchers working in this field. Unlike gradient-based techniques, the stochastic nature of metaheuristics helps in evading local minima. The literature shows several metaheuristics that have been applied successfully to train an MLP: Genetic Algorithm (GA) (Seiffert, 2001), Particle Swarm Optimization (PSO) (Mendes, Cortez, Rocha, & Neves, 2002; Rashid & Baig, 2010; Yu, Wang, & Xi, 2008), Differential Evolution (DE) (Slowik & Bialko, 2008), Ant Colony Optimization (ACO) (Blum & Socha, 2005), Artificial Bee Colony (ABC) (Bullinaria & AlYahya, 2014) and Evolution Strategy (Wienholt, 1993).

A review of the aforementioned algorithms suggests that GA has been the most popular metaheuristic for training MLPs. One possible reason for its popularity is that GA has been in existence for a long time and is perhaps one of the most well-studied algorithms of modern times. It has undergone several modifications and variations to enhance its performance and has been successfully applied to various problems. The basic operators of GA, crossover and mutation, are particularly designed to enhance the exploration capability of GA by causing abrupt changes in the candidate solutions, thereby helping to avoid local minima. Besides GA, other recent metaheuristics that have shown good results are PSO and ACO. Both algorithms have certain shortcomings that make them more susceptible to getting trapped in local minima. The initial swarm distribution affects the performance of PSO, and the main concept of PSO is based on interaction among the swarm members: when most of the particles are trapped in a local minimum, there is very little chance of preventing the rest of the particles from being trapped in the same minimum. The use of the pheromone matrix for reinforcement learning and exploitation in ACO increases the tendency of getting trapped in local minima.

Another metaheuristic that has been widely used for optimization is Differential Evolution (DE). However, Piotrowski (2014) pointed out through a number of experiments that DE is probably not suitable for training neural networks, as it suffers from the problem of stagnation, probably due to loss of diversity.

Nevertheless, many instances available in the literature show that no metaheuristic can work equally well for all types of optimization problems (No Free Lunch Theorem) (Wolpert & Macready, 1997; Ho & Pepyne, 2002). This indicates that there is a possibility of further improvement in the existing algorithms, which is the main motivation behind the present work.

In the case of the MLP, the focus of the present work, the search space for training is quite complex, differs from dataset to dataset, and requires a robust algorithm that can deal efficiently with different kinds of data sets. In this study, the hybridization of two of the most widely used and most successful metaheuristic techniques for training MLPs is exploited: Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). GA and PSO are hybridized in five different ways, and the resulting algorithms are tested and analysed on different benchmark datasets and are finally employed on the Molecular Brain Neoplasia Data.

The paper is divided into 11 sections. Section 2 explains the functionality of the MLP and how particles/individuals are represented for the problem of training an MLP. Section 3 reviews various works on hybrid PSO-GA. Sections 4-6 describe the GA, PSO and hybrid PSO-GA algorithms respectively. Section 7 gives a detailed description of the Molecular Brain Neoplasia Data. Section 8 deals with designing the MLP for classification. Section 9 explains the various metrics used for performance measurement in this paper. Section 10 discusses the experimental results and Section 11 concludes the paper.

2. Multi-Layer Perceptron (MLP)

The MLP is a class of feedforward Artificial Neural Network (ANN) having a minimum of three layers. The basic structure of an MLP is depicted in Fig. 1. Here, n, h and m represent the numbers of input, hidden and output nodes respectively. In this work, all the weights of an MLP are stored in the matrix W and all the biases are stored in the matrix B; in the equations below, the weights and biases are indexed in the order in which they are stored. The MLP output is calculated as follows. First, the weighted sums of the inputs are calculated:

s_j = \sum_{i=1}^{n} W((j-1)n + i) \, I_i + B(j), \quad j = 1, 2, \ldots, h    (1)

where W((j-1)n + i) is the weight from the ith input node to the jth hidden node, B(j) is the bias of the jth hidden node and I_i is the ith input. Then the output at each hidden node is given by

S_j = \mathrm{sigmoid}(s_j) = \frac{1}{1 + \exp(-s_j)}, \quad j = 1, 2, \ldots, h    (2)

Lastly, the final outputs are calculated:

o_k = \sum_{j=1}^{h} W(nh + (k-1)h + j) \, S_j + B(h + k), \quad k = 1, 2, \ldots, m    (3)
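For concreteness, a minimal sketch of this forward pass in Python is given below, assuming the weights W and biases B are stored in flat arrays indexed exactly as in Eqs. (1)-(3) (with the 1-based indices of the equations converted to 0-based array indices; the function and variable names are ours, not the paper's).

    import numpy as np

    def mlp_forward(W, B, x, n, h, m):
        """Forward pass of an n-h-m MLP with flat weight/bias vectors.

        W has n*h + h*m entries and B has h + m entries, laid out as in
        Eqs. (1)-(3): input-to-hidden values first, then hidden-to-output.
        """
        # Eq. (1): weighted sums at the hidden nodes
        s = np.array([sum(W[j * n + i] * x[i] for i in range(n)) + B[j]
                      for j in range(h)])
        # Eq. (2): sigmoid activation at the hidden nodes
        S = 1.0 / (1.0 + np.exp(-s))
        # Eq. (3): outputs
        o = np.array([sum(W[n * h + k * h + j] * S[j] for j in range(h)) + B[h + k]
                      for k in range(m)])
        return o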
... features. As the number of features in a problem increases, the MLP gets bigger, subsequently making the search space more complex.

3. Literature review

Hybridization of PSO and GA has found application in solving various problems, and the hybridization technique varies from problem to problem. In this section, various techniques of PSO-GA hybridization are reviewed, focusing mainly on the process of hybridization and not on the application for which it is used.

Robinson, Sinton, and Rahmat-Samii (2002) investigated the possibility of hybridizing PSO and GA to optimize the design of a profiled corrugated horn antenna. They proposed two approaches, GA-PSO and PSO-GA. In the GA-PSO approach, GA is applied until the improvement in the objective function evaluation starts to level off, and then the GA output is used as the input to PSO; in PSO-GA, PSO is applied first, followed by GA. Shi, Lu, Zhou, Lee, Lin, and Liang (2003) used two techniques to integrate PSO and GA: in parallel (PGPHEA) and in series (PGSHEA). Juang (2004) proposed a hybrid GA-PSO (HGAPSO) for the optimization of recurrent neural networks, where the initial population is divided into two halves; the better half is given as input to PSO and the rest is discarded, the output of PSO is given as input to GA, and finally the outputs of PSO and GA are combined to generate the new population for the next generation. Li, Zhao, Guo, and Teng (2006) proposed PHPSO-GA, a parallel hybridization of PSO and GA for packing and layout design problems, where adaptive crossover and mutation operators of GA are applied to divide the population into various classes and then different PSO update operators are employed according to the nature of the different classes. Du, Li, and Cao (2006) proposed a learning algorithm for ANNs based on GA-PSO, where candidate solutions are generated through crossover and mutation along with PSO, based on a redefined local optimization swarm; it has good global search capacity as well as local-minima-avoiding capability. Kao and Zahara (2008) presented an integrated GA-PSO for the global optimization of multimodal functions: the population is divided into a best and a worst half, the best half is fed to GA, the worst half and the output of GA are fed to PSO, and then the outputs of GA and PSO are combined to generate the new population for the next generation. Premalatha and Natarajan (2009) used three different techniques to hybridize GA and PSO for global maximization: parallel, series and a technique where each particle in PSO changes its best position through the mutation operator of GA. Marinakis and Marinaki (2010) used a Genetic-PSO method for solving the vehicle routing problem, where in each generation GA is applied on the population followed by PSO. Kuo and Han (2011) presented three hybrid PSO-GA methods to solve the bi-level linear programming problem: HGAPSO-1, HGAPSO-2 and HGAPSO-3. The first two methods are based on (Kao & Zahara, 2008) and (Du et al., 2006) respectively, while the third one is proposed in that paper by incorporating mutation with PSO and then using an elitist method to enhance the evolutionary performance. Yu, Wei, and Wang (2012) generated a model for energy demand estimation based on PSO and GA (PSO-GA EDE) for China, where PSO processes the population over a specified number of generations and then the best N particles are chosen, discarding the remaining M particles (population size = N + M); GA then generates M new individuals from the best N particles, and lastly a new population is generated by combining these M new individuals and the N best particles, which is used in the next generation. Abdel-Kader (2011) used a PSO-GA hybrid algorithm for solving the Quality of Service (QoS) multicast routing problem. Here, PSO and GA are applied in series along with a replacement term w ∈ [0, 1]. First, PSO is applied to the current population and the best population_size*(1 - w) particles are included in the new population; next, the replacement term is used to determine the appropriate number of individuals, GA is applied on these individuals, and the rest of the population is filled by the output of GA. For solving non-linear optimization problems, PSO and GA hybridized in series, along with a dynamic constriction factor to maintain the feasibility of the particles, was applied by Abd-El-Wahed, Mousa, and El-Shorbagy (2011). Kuo, Syu, Chen, and Tien (2012) introduced a dynamic clustering technique based on hybrid PSO-GA (DCPG), where binary PSO is used along with the crossover and mutation operators of GA to perform mating between the personal best and global best solutions of PSO and then mutate the global best solution. Sheikhalishahi, Ebrahimipour, Shiri, Zaman, and Jeihoonian (2013) proposed three ways to hybridize PSO and GA for solving the Reliability Redundancy Allocation Problem (RRAP): series, series-parallel and complex (bridge) system. Utkarsh, Kantha, Praveen, and Kumar (2015) used GA and PSO in a series combination to train a Functional Link ANN used as a channel equalizer in adaptive signal processing. For solving constrained optimization problems, Garg (2016) proposed a hybrid PSO-GA algorithm where a solution is obtained via PSO and then GA operators are applied on this solution to explore the search space. Ali and Tawhid (2017) used PSO-GA for minimizing the molecular potential energy function by first applying PSO, then performing dimensionality reduction and population partitioning through arithmetic crossover, and lastly employing the mutation operation of GA to avoid premature convergence. Asadnia, Khorasani, and Warkiani (2017) used PSO and GA in a series combination to train an ANN for determining the growth parameters of Carbon Nano Tubes (CNTs). Semero, Zhang, Zheng, and Wei (2018) used a parallel technique to combine PSO and GA to optimize a multi-layered feed-forward ANN model for wind power generation forecasting; here PSO and GA are executed simultaneously, and after each iteration the best solutions of PSO and GA are compared and the better one is used in the next generation for both algorithms. Anand, Suganthi, Anand, and Suganthi (2018) used an
ANN-GA-PSO model for forecasting electricity demand in Tamil Nadu; here PSO and GA are integrated in a simple series combination where the result of PSO is fed to GA.

4. Genetic algorithm

Holland (1992) introduced the Genetic Algorithm (GA), a technique based on the biological evolution process for solving complex optimization problems. Selection, crossover and mutation are its three main operations. In GA, parents are selected from the individuals and children are produced through crossover and mutation; these children become the individuals of the next generation. Over generations, the population of individuals evolves towards an optimal solution. Fig. 2 shows the workflow of GA. The algorithmic steps of GA to train an n-h-m MLP are given below:

Step 1: Randomly initialize a population with each individual having (nh + mh) + (h + m) attributes. N = size of population.
Step 2: Set the crossover type, selection type, crossover probability, mutation probability and number of maximum generations.
Step 3: Calculate the MSE for each individual in the population through forward pass operations on the MLP.
Step 4: Calculate the fitness value for each individual.
Step 5: Depending on the fitness values, choose the fittest half of the population.
Step 6: Depending on the crossover type and crossover probability, perform the crossover operation between chosen individuals.
Step 7: Perform the mutation operation.
Step 8: Replace the worst half by the new children generated through Steps 5, 6 and 7.
Step 9: Check if the maximum number of generations is completed. If yes, output the best individual of the current population as the result. If not, go to Step 3.

Fig. 2. Genetic algorithm flowchart.

5. Particle swarm optimization

Particle Swarm Optimization (PSO) was introduced by Kennedy and Eberhart (1995). In PSO, each particle is characterized by a position vector and a velocity vector. Initially the positions and velocities are generated randomly within the specified ranges. An individual's performance depends not only on its own experience (pbest) but also on what it learns from the behaviour of the other particles (gbest).

The velocity is updated using the following formula:

v_i^j(t+1) = w \, v_i^j(t) + r_1 c_1 \left( \mathrm{pbest}_i^j(t) - x_i^j(t) \right) + r_2 c_2 \left( \mathrm{gbest}^j(t) - x_i^j(t) \right)    (6)

The new position of the particle is given as follows:

x_i^j(t+1) = x_i^j(t) + v_i^j(t+1)    (7)

where N = number of particles in the swarm; i = 1, 2, ..., N; j = 1, 2, ..., Dim; r_1, r_2 ∈ [0, 1]; t = current iteration; c_1 = cognitive constant; c_2 = social constant; w = inertia weight.

Fig. 3 shows the workflow of PSO. The algorithmic steps of PSO to train an n-h-m MLP are given below:

Step 1: Randomly initialize a population with each particle having (nh + mh) + (h + m) attributes. Initialize velocities corresponding to each particle. N = size of population.
Step 2: Set the inertia weight, acceleration constants and number of maximum generations.
Step 3: Calculate the MSE for each particle in the population through forward pass operations on the MLP.
Step 4: Determine the pbest and gbest solutions.
Step 5: Update the particle velocities using Eq. (6).
Step 6: Update the particle positions using Eq. (7).
Step 7: Calculate the MSE for each particle in the population through forward pass operations on the MLP.
Step 8: Update the pbest and gbest solutions.
Step 9: Check if the maximum number of generations is completed. If yes, output the current gbest as the result. If not, go to Step 5.
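As an illustration, a minimal sketch of one PSO iteration implementing Eqs. (6) and (7) is given below. The variable names and the numeric defaults (w, c1, c2) are ours, used only as placeholders; the parameter values actually used in the paper are those of Table 2.

    import numpy as np

    def pso_step(x, v, pbest, gbest, w=0.7, c1=2.0, c2=2.0):
        """One PSO iteration: velocity update per Eq. (6), position update per Eq. (7).

        x, v, pbest: arrays of shape (N, Dim); gbest: array of shape (Dim,).
        """
        N, Dim = x.shape
        r1 = np.random.rand(N, Dim)  # random numbers in [0, 1]
        r2 = np.random.rand(N, Dim)
        v = w * v + r1 * c1 * (pbest - x) + r2 * c2 * (gbest - x)   # Eq. (6)
        x = x + v                                                   # Eq. (7)
        return x, v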
... the workflow of SPSOGA. The algorithmic steps of SPSOGA to train an n-h-m MLP are given below:

Step 1: Randomly initialize a population with each particle having (nh + mh) + (h + m) attributes. Initialize velocities corresponding to each particle. N = size of population.
Step 2: Set the inertia weight, acceleration constants, crossover type, selection type, crossover probability, mutation probability and number of maximum generations.
Step 3: Calculate the MSE for each particle in the population through forward pass operations on the MLP.
Step 4: Determine the pbest and gbest solutions.
Step 5: Update the particle velocities using Eq. (6).
Step 6: Update the particle positions using Eq. (7).
Step 7: Calculate the MSE for each particle in the population through forward pass operations on the MLP.
Step 8: Update the pbest and gbest solutions.
Step 9: Check if half of the maximum number of generations is completed. If yes, go to Step 10. If not, go to Step 5.
Step 10: Calculate the fitness value for each individual.
Step 11: Depending on the fitness values, choose the fittest half of the population.
Step 12: Depending on the crossover type and crossover probability, perform the crossover operation between chosen individuals.
Step 13: Perform the mutation operation.
Step 14: Replace the worst half by the new children generated through Steps 11, 12 and 13.
Step 15: Check if the maximum number of generations is completed. If yes, output the best individual of the current population as the result. If not, go to Step 16.
Step 16: Calculate the MSE for each particle in the population through forward pass operations on the MLP and go to Step 10.
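Read as a whole, these steps amount to running PSO for the first half of the generation budget and then switching to GA for the second half. The following self-contained sketch paraphrases that control flow; it is not the paper's implementation, and the GA operators (arithmetic crossover, Gaussian mutation) and numeric values used here are simple placeholders, the actual settings being those of Table 2.

    import numpy as np

    def train_spsoga(mse, dim, n_particles=100, max_gen=200,
                     w=0.7, c1=2.0, c2=2.0, p_mut=0.05, init_range=10.0):
        """Series PSO-then-GA loop sketched from the SPSOGA steps above."""
        rng = np.random.default_rng(0)
        x = rng.uniform(-init_range, init_range, (n_particles, dim))    # Step 1
        v = rng.uniform(-1.0, 1.0, (n_particles, dim))
        fitness = np.apply_along_axis(mse, 1, x)                        # Step 3
        pbest, pbest_f = x.copy(), fitness.copy()
        gbest = x[np.argmin(fitness)].copy()                            # Step 4

        for _ in range(max_gen // 2):                                   # Steps 5-9: PSO phase
            r1 = rng.random((n_particles, dim))
            r2 = rng.random((n_particles, dim))
            v = w * v + r1 * c1 * (pbest - x) + r2 * c2 * (gbest - x)   # Eq. (6)
            x = x + v                                                   # Eq. (7)
            fitness = np.apply_along_axis(mse, 1, x)                    # Step 7
            improved = fitness < pbest_f                                # Step 8
            pbest[improved], pbest_f[improved] = x[improved], fitness[improved]
            gbest = pbest[np.argmin(pbest_f)].copy()

        for _ in range(max_gen - max_gen // 2):                         # Steps 10-16: GA phase
            order = np.argsort(fitness)
            fittest = x[order[: n_particles // 2]]                      # Steps 10-11
            parents = fittest[rng.integers(0, len(fittest), (n_particles // 2, 2))]
            alpha = rng.random((n_particles // 2, 1))
            children = alpha * parents[:, 0] + (1 - alpha) * parents[:, 1]   # Step 12
            mutate = rng.random(children.shape) < p_mut                      # Step 13
            children[mutate] += rng.normal(0.0, 1.0, children.shape)[mutate]
            x = np.vstack([fittest, children])                          # Step 14
            fitness = np.apply_along_axis(mse, 1, x)                    # Step 16

        return x[np.argmin(fitness)]                                    # Step 15: best individual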
6.5. Hybrid PSO-GA (HPSOGA)

This hybridization is a combination of the series as well as the parallel techniques. Here, after initializing the population with N particles and evaluating the fitness of each particle, the best N/2 particles are fed to PSO. Then the result of PSO is fed to GA. Finally, the results of PSO and GA are combined to form N particles, and again the N/2 best particles are selected from these N particles for the next generation. Fig. 8 shows the workflow of HPSOGA. The algorithmic steps of HPSOGA to train an n-h-m MLP are given below: ...
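The original step-by-step listing for this variant is not reproduced in the excerpt above. Purely as an illustration of the scheme just described (best N/2 particles refined by PSO, the PSO output refined by GA, both results merged and truncated back to the best N/2), a rough per-generation sketch under our own naming could be:

    def hpsoga_generation(population, mse, pso_phase, ga_phase):
        """One HPSOGA generation as described above (illustrative sketch only).

        `pso_phase` and `ga_phase` are caller-supplied callables wrapping the PSO
        update (Section 5) and the GA operators (Section 4); `mse` scores a candidate.
        """
        n = len(population)
        best_half = sorted(population, key=mse)[: n // 2]   # best N/2 particles
        pso_out = pso_phase(best_half)                      # best half refined by PSO
        ga_out = ga_phase(pso_out)                          # PSO result fed to GA
        combined = list(pso_out) + list(ga_out)             # N particles in total
        return sorted(combined, key=mse)[: n // 2]          # best N/2 for the next generation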
d = Number of samples present in the dataset.

After executing the algorithm for a certain number of iterations, the final MSE of the solution obtained by an algorithm is recorded for the purpose of comparison. Obviously, the lower the value of the MSE, the better the performance of the algorithm.

9.2. Classification accuracy

After training an MLP through an algorithm, a testing dataset is given to the MLP. Classification accuracy is the percentage of data accurately classified by the MLP. Hence, this is a direct measurement of how well an algorithm trains the MLP: the better it trains, the more accurate the classification will be.

9.3. Convergence rate

The convergence rate indicates how fast and how smoothly an algorithm approaches the optimal solution. The smoother the curve, the more reliable the behavior in terms of local minima avoidance.

10. Experimental results and discussion

The five hybrid PSO-GA algorithms described in Section 6 are compared to PSO, GA, ACO, DE and Backpropagation (BP) in this section. First they are tested on 6 benchmark datasets: 3 benchmark function approximation datasets and 3 benchmark classification datasets. Then they are applied on the high-dimensional Molecular Brain Neoplasia data.

10.1. Parameter setting

The population size is 100 for every dataset. Each candidate in the population is initialized randomly in the range [−10, 10]. In the case of the metaheuristic algorithms, the maximum number of generations is 200 for the function approximation datasets and 300 for the classification datasets. In the case of Backpropagation (BP), the maximum number of iterations is 250. The assumptions and parameter values of each algorithm are presented in tabulated form in Table 2. Tuning the parameters is not within the scope of this paper.
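As a reference point, the MSE and classification accuracy metrics described in Section 9 can be computed as in the following sketch, assuming their standard definitions; the function and variable names are ours, not the paper's.

    import numpy as np

    def mse(targets, outputs):
        """Mean squared error over the d samples of a dataset (Section 9.1)."""
        targets, outputs = np.asarray(targets), np.asarray(outputs)
        return np.mean((targets - outputs) ** 2)

    def classification_accuracy(true_labels, predicted_labels):
        """Percentage of test samples classified correctly (Section 9.2)."""
        true_labels = np.asarray(true_labels)
        predicted_labels = np.asarray(predicted_labels)
        return 100.0 * np.mean(true_labels == predicted_labels)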
Table 6
XOR dataset results.

Algorithm    MSE           Classification accuracy (%)
PSO          0.066471      50
GA           3.98e-06      100
ACO          0.060283      62
DE           0.0035        75
SPSOGA       0.061311      80
PPSOGA       6.67e-07      100
SGAPSO       0.00043793    100
PPSOGA2      3.05e-06      100
HPSOGA       0.0060741     91
BP           0.0697        50

Although it is difficult to specify which hybrid algorithm will work better for which type of dataset, SGAPSO and PPSOGA narrowly pass this generalization as they show the best results in most of the cases. As the datasets get bigger and more complex, the relative performance of the hybrid algorithms compared to the individual algorithms gets better. As discussed in Section 7, the Molecular Brain Neoplasia data is a high-dimensional dataset with a total of 2775 parameters to be optimized. From the benchmark dataset results it can be extrapolated that the hybrid algorithms will outperform the individual algorithms. Still, all the algorithms are applied on this dataset for comparison purposes and to observe by what margin the hybrid algorithms outperform the individual ones. The experimental results are presented in Table 9 and Fig. 19. It can be observed from the results that SGAPSO achieves the best MSE, the best classification accuracy and the fastest convergence, followed by SPSOGA. In the case of high dimensionality also, the prior use of GA before PSO holds its superiority. From Fig. 19, it is evident that though the MSE values are not appropriate, the curves of the parallel hybridizations (PPSOGA, PPSOGA2 and HPSOGA) are smoother than the others.
Table 7
Iris dataset results.

Algorithm    MSE          Classification accuracy (%)
PSO          0.32444      35
GA           0.030057     86
ACO          0.27444      45
DE           0.11741      64
SPSOGA       0.029044     91
PPSOGA       0.037154     89
SGAPSO       0.074193     75
PPSOGA2      0.026446     92
HPSOGA       0.031465     75
BP           0.20324      52

11. Conclusion

In this paper, five versions of hybrid PSO-GA algorithms are presented for the training of MLPs and consequently for classifying different sets of problems, including 6 benchmark datasets (3 benchmark function approximation datasets: Sigmoid, Sphere and Rastrigin; and 3 benchmark classification datasets: 3-bit XOR, Iris and Breast Cancer) and the Molecular Brain Neoplasia data. The following conclusions can be drawn from the present study:

1. It is evident from the numerical results and graphical interpretations that hybridization of PSO and GA is an efficient way of training MLPs. This is quite obvious because hybridization provides a synergetic effect on the working of the algorithms.
... datasets; doing a thorough analysis of the proposed algorithms; comparing the proposed algorithms with other hybrid variants available in the literature; and so on.

Declaration of Competing Interest

The authors declare that there is no conflict of interest regarding the publication of this article.

References

Abdel-Kader, R. F. (2011). Hybrid discrete PSO with GA operators for efficient QoS-multicast routing. Ain Shams Engineering Journal, 2(1), 21–31.
Abd-El-Wahed, W. F., Mousa, A. A., & El-Shorbagy, M. A. (2011). Integrating particle swarm optimization with genetic algorithms for solving nonlinear optimization problems. Journal of Computational and Applied Mathematics, 235(5), 1446–1453.
Ali, A. F., & Tawhid, M. A. (2017). A hybrid particle swarm optimization and genetic algorithm with population partitioning for large scale optimization problems. Ain Shams Engineering Journal, 8(2), 191–206.
Anand, A., Suganthi, L., Anand, A., & Suganthi, L. (2018). Hybrid GA-PSO optimization of artificial neural network for forecasting electricity demand. Energies, 11(4), 728.
Asadnia, M., Khorasani, A. M., & Warkiani, M. E. (2017). An accurate PSO-GA based neural network to model growth of carbon nanotubes. Journal of Nanomaterials, 2017, 1–6.
Blum, C., & Socha, K. (2005). Training feed-forward neural networks with ant colony optimization: An application to pattern classification. In Fifth international conference on hybrid intelligent systems (HIS'05), 6 pp.
Bullinaria, J. A., & AlYahya, K. (2014). Artificial bee colony training of neural networks. Cham: Springer (pp. 191–201).
Clark, K., et al. (2013). The cancer imaging archive (TCIA): Maintaining and operating a public information repository. Journal of Digital Imaging, 26(6), 1045–1057.
Du, S., Li, W., & Cao, K. (2006). A learning algorithm of artificial neural network based on GA-PSO. In 2006 6th world congress on intelligent control and automation, pp. 3633–3637.
Fahlman, S. E. (1988). An empirical study on learning speed in the back propagation (No. 4976).
Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179–188.
Garg, H. (2016). A hybrid PSO-GA algorithm for constrained optimization problems. Applied Mathematics and Computation, 274, 292–305.
Gori, M., & Tesi, A. (1992). On the problem of local minima in backpropagation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(1), 76–86.
Ho, Y. C., & Pepyne, D. L. (2002). Simple explanation of the no-free-lunch theorem and its implications. Journal of Optimization Theory and Applications, 115(3), 549–570.
Holland, J. H. (1992). Genetic algorithms. Scientific American, 267(1), 66–73.
Juang, C.-F. (2004). A hybrid of genetic algorithm and particle swarm optimization for recurrent network design. IEEE Transactions on Systems, Man, and Cybernetics, Part B, 34(2), 997–1006.
Kao, Y.-T., & Zahara, E. (2008). A hybrid genetic algorithm and particle swarm optimization for multimodal functions. Applied Soft Computing, 8(2), 849–857.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proceedings of ICNN'95 – International conference on neural networks, vol. 4, pp. 1942–1948.
Kuo, R. J., & Han, Y. S. (2011). A hybrid of genetic algorithm and particle swarm optimization for solving bi-level linear programming problem – A case study on supply chain model. Applied Mathematical Modelling, 35(8), 3905–3917.
Kuo, R. J., Syu, Y. J., Chen, Z.-Y., & Tien, F. C. (2012). Integration of particle swarm optimization and genetic algorithm for dynamic clustering. Information Sciences, 195, 124–140.
Lee, Y., Oh, S.-H., & Kim, M. W. (1993). An analysis of premature saturation in back propagation learning. Neural Networks, 6(5), 719–728.
Li, G., Zhao, F., Guo, C., & Teng, H. (2006). Parallel hybrid PSO-GA algorithm and its application to layout design. Berlin, Heidelberg: Springer (pp. 749–758).
Marinakis, Y., & Marinaki, M. (2010). A hybrid genetic – particle swarm optimization algorithm for the vehicle routing problem. Expert Systems with Applications, 37(2), 1446–1455.
Mendes, R., Cortez, P., Rocha, M., & Neves, J. (2002). Particle swarms for feedforward neural network training. In Proceedings of the 2002 international joint conference on neural networks (IJCNN'02), pp. 1895–1899.
Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). Let a biogeography-based optimizer train your multi-layer perceptron. Information Sciences, 269, 188–209.
Piotrowski, A. P. (2014). Differential evolution algorithms applied to neural network training suffer from stagnation. Applied Soft Computing, 21, 382–406.
Premalatha, K., & Natarajan, A. M. (2009). Hybrid PSO and GA for global maximization. International Journal of Open Problems in Computer Science and Mathematics, 2(4).
Rashid, M., & Baig, A. R. (2010). Improved opposition-based PSO for feedforward neural network training. In 2010 international conference on information science and applications, pp. 1–6.
Robinson, J., Sinton, S., & Rahmat-Samii, Y. (2002). Particle swarm, genetic algorithm, and their hybrids: Optimization of a profiled corrugated horn antenna. In IEEE antennas and propagation society international symposium, vol. 1, pp. 314–317.
Scarpace, L., Flanders, A. E., Jain, R., Mikkelsen, T., & Andrews, D. W. (2015). Data from REMBRANDT. The Cancer Imaging Archive.
Seiffert, U. (2001). Multiple layer perceptron training using genetic algorithms. In European symposium on artificial neural networks (ESANN), pp. 159–164.
Semero, Y. K., Zhang, J., Zheng, D., & Wei, D. (2018). A GA-PSO hybrid algorithm based neural network modeling technique for short-term wind power forecasting. Distributed Generation and Alternative Energy Journal, 33(4), 26–43.
Sheikhalishahi, M., Ebrahimipour, V., Shiri, H., Zaman, H., & Jeihoonian, M. (2013). A hybrid GA–PSO approach for reliability optimization in redundancy allocation problem. International Journal of Advanced Manufacturing Technology, 68(1–4), 317–338.
Shi, X. H., Lu, Y. H., Zhou, C. G., Lee, H. P., Lin, W. Z., & Liang, Y. C. (2003). Hybrid evolutionary algorithms based on PSO and GA. In The 2003 congress on evolutionary computation (CEC '03), vol. 4, pp. 2393–2399.
Slowik, A., & Bialko, M. (2008). Training of artificial neural networks using differential evolution algorithm. In 2008 conference on human system interactions, pp. 60–65.
Utkarsh, A., Kantha, A. S., Praveen, J., & Kumar, J. R. (2015). Hybrid GA-PSO trained functional link artificial neural network based channel equalizer. In 2015 2nd international conference on signal processing and integrated networks (SPIN), pp. 285–290.
Vogl, T. P., Mangis, J. K., Rigler, A. K., Zink, W. T., & Alkon, D. L. (1988). Accelerating the convergence of the back-propagation method. Biological Cybernetics, 59(4–5), 257–263.
Wienholt, W. (1993). Minimizing the system error in feedforward neural networks with evolution strategy. In ICANN '93 (pp. 490–493). London: Springer.
Wolberg, W. H., & Mangasarian, O. L. (1990). Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences of the United States of America, 87(23), 9193–9196.
Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82.
Yu, J., Wang, S., & Xi, L. (2008). Evolving artificial neural networks using an improved PSO and DPSO. Neurocomputing, 71(4–6), 1054–1060.
Yu, S., Wei, Y.-M., & Wang, K. (2012). A PSO–GA optimal model to estimate primary energy demand of China. Energy Policy, 42, 329–340.