Unit 4 - Soft Computing - WWW - Rgpvnotes.in
Subject Name: Soft Computing
Subject Code: CS-8001
Semester: 8th
Downloaded from be.rgpvnotes.in
UNIT-4 Notes
In GAs, we have a pool or a population of possible solutions to the given problem. These
solutions then undergo recombination and mutation (like in natural genetics), producing new
children, and the process is repeated over various generations. Each individual (or candidate
solution) is assigned a fitness value (based on its objective function value) and the fitter
individuals are given a higher chance to mate and yield fitter individuals. This is in line
with the Darwinian theory of “Survival of the Fittest”.
Genetic Algorithms are sufficiently randomized in nature, but they perform much better than
random local search (in which we just try various random solutions, keeping track of the best so
far), as they exploit historical information as well.
Basic Concepts
Individual: An individual is a single solution. An individual groups together two forms of a
solution, as given below:
1. The chromosome, which is the raw genetic information (genotype) that the GA deals with.
2. The phenotype, which is the expression of the chromosome in terms of the model.
A chromosome is subdivided into genes.
Genes: Genes are the basic instructions for building a GA. A chromosome is a sequence of
genes. Genes may describe a possible solution to a problem, without actually being the
solution. A gene is a bit string of arbitrary length. The bit string is a binary representation of
the number of intervals from a lower bound. A gene is the GA's representation of a single factor
value for a control factor, where the control factor must have an upper bound and a lower bound.
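One way to read the "number of intervals from a lower bound" description is sketched below in
Python. The function name decode_gene and the linear mapping over equal-width intervals are
assumptions made for illustration; the notes do not fix a particular decoding.

```python
def decode_gene(bits: str, lower: float, upper: float) -> float:
    """Map a bit string to a value between the bounds of a control factor.

    Assumption: the bits encode an unsigned integer k, and the range
    [lower, upper] is split into 2**len(bits) - 1 equal intervals.
    """
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("gene must be a non-empty string of 0s and 1s")
    k = int(bits, 2)               # number of intervals counted from the lower bound
    steps = 2 ** len(bits) - 1     # largest value k can take
    return lower + k * (upper - lower) / steps


# A 4-bit gene for a control factor bounded by -5 and 5.
value = decode_gene("1000", -5.0, 5.0)
```

Longer genes give a finer grid between the same two bounds, at the cost of a larger search space.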
Fitness: The fitness of an individual in a GA is the value of an objective function for its
phenotype. For calculating fitness, the chromosome has to be first decoded and the objective
function has to be evaluated. The fitness not only indicates how good the solution is, but also
corresponds to how close the chromosome is to the optimal one.
In the case of multi-criterion optimization, the fitness function is more difficult to
determine. In multi-criterion optimization problems, there is often a dilemma as to how to
determine whether one solution is better than another.
Population: A population is a collection of individuals. A population consists of a number of
individuals being tested, the phenotype parameters defining the individuals and some
information about the search space. The two important aspects of population used in GA are:
1. The initial population generation
2. The population size.
For each problem, the population size will depend on the complexity of the problem.
The initial population is often generated at random.
Working Principle
The working principle of a GA is as follows:
We start with an initial population (which may be generated at random or seeded by other
heuristics) and select parents from this population for mating. We then apply crossover and
mutation operators to the parents to generate new offspring. Finally, these offspring replace the
existing individuals in the population and the process repeats (see figure 1.1). In this way genetic
algorithms try to mimic natural evolution to some extent.
A generalized pseudo-code for a GA is given below:
Genetic Algorithm()
    initialize population
    find fitness of population
    while (termination criteria is not reached) do
        parent selection
        crossover with probability pc
        mutation with probability pm
        decode and fitness calculation
        survivor selection
        find best
    return best
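The pseudo-code above can be sketched as a small runnable Python function. This is only an
illustration under stated assumptions: chromosomes are lists of bits, parents are chosen by
2-way tournament, crossover is one-point, survivor selection simply replaces the old
population, and the only termination criterion is a generation limit. The function name and
all default parameter values are hypothetical.

```python
import random


def genetic_algorithm(fitness, n_bits, pop_size=20, pc=0.9, pm=0.02,
                      max_generations=100, seed=0):
    """Maximise `fitness` over bit lists of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)
    # initialize population at random
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)

    for _ in range(max_generations):          # termination: generation limit
        children = []
        while len(children) < pop_size:
            # parent selection (assumed: 2-way tournament)
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            c1, c2 = p1[:], p2[:]
            # crossover with probability pc (assumed: one-point)
            if rng.random() < pc:
                site = rng.randint(1, n_bits - 1)
                c1 = p1[:site] + p2[site:]
                c2 = p2[:site] + p1[site:]
            # mutation: flip each bit with probability pm
            for child in (c1, c2):
                for i in range(n_bits):
                    if rng.random() < pm:
                        child[i] = 1 - child[i]
                children.append(child)
        # survivor selection (assumed: children replace the whole population)
        population = children[:pop_size]
        # find best so far
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


# Toy objective: maximise the number of 1s in a 16-bit chromosome.
best = genetic_algorithm(sum, n_bits=16)
```

Passing the built-in sum as the fitness function gives the classic "count the ones" toy problem,
which is a convenient way to check that the loop improves the population at all.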
Encoding: Encoding is the process of representing individual genes. It can be performed
using bits, numbers, trees, arrays, lists or any other objects. The choice of encoding depends
mainly on the problem being solved.
Types of encoding
One of the most important decisions to make while implementing a genetic algorithm is
deciding the representation that we will use to represent our solutions. It has been observed
that improper representation can lead to poor performance of the GA.
Therefore, choosing a proper representation, with a proper definition of the mappings
between the phenotype and genotype spaces, is essential for the success of a GA.
In this section, we present some of the most commonly used representations for genetic
algorithms. However, representation is highly problem specific and the reader might find that
another representation or a mix of the representations mentioned here might suit his/her
problem better.
1. Binary Encoding
This is one of the simplest and most widely used representations in GAs. In this type of
representation the genotype consists of bit strings.
For some problems, when the solution space consists of Boolean decision variables (yes or no),
the binary representation is natural. Take for example the 0/1 Knapsack Problem. If there are n
items, we can represent a solution by a binary string of n elements, where the x-th element tells
whether item x is picked (1) or not (0).
2. Real Value Encoding
For problems where we want to define the genes using continuous rather than discrete
variables, the real valued representation is the most natural. The precision of these real valued
or floating point numbers is, however, limited by the computer's floating point precision.
3. Integer Encoding
For discrete valued genes, we cannot always limit the solution space to binary “yes” or “no”. For
example, if we want to encode the four directions (North, South, East and West), we can
encode them as {0,1,2,3}. In such cases, integer representation is desirable.
4. Permutation Encoding
In many problems, the solution is represented by an order of elements. In such cases
permutation representation is the most suited.
A classic example of this representation is the travelling salesman problem (TSP). In this the
salesman has to take a tour of all the cities, visiting each city exactly once and come back to the
starting city. The total distance of the tour has to be minimized. The solution to this TSP is
naturally an ordering or permutation of all the cities and therefore using a permutation
representation makes sense for this problem.
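As a small illustration of the permutation representation, the sketch below evaluates a TSP
tour in Python. The chromosome is a list of city indices; the coordinate list, the function
name tour_length and the use of straight-line distance are assumptions made for the example.

```python
import math


def tour_length(tour, coords):
    """Total length of a closed tour; `tour` is a permutation of city indices."""
    if sorted(tour) != list(range(len(coords))):
        raise ValueError("tour must visit each city exactly once")
    total = 0.0
    for i in range(len(tour)):
        a = coords[tour[i]]
        b = coords[tour[(i + 1) % len(tour)]]   # wrap around to the starting city
        total += math.dist(a, b)
    return total


# Four cities on the corners of a unit square.
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
around_the_square = tour_length([0, 1, 2, 3], cities)
with_diagonals = tour_length([0, 2, 1, 3], cities)
```

Because every chromosome must remain a valid permutation, crossover and mutation operators for
this encoding have to preserve that property.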
Fitness Function
The fitness function, simply defined, is a function which takes a candidate solution to the
problem as input and produces as output how “fit” or how “good” the solution is with respect
to the problem under consideration.
Calculation of fitness value is done repeatedly in a GA and therefore it should be sufficiently
fast. A slow computation of the fitness value can adversely affect a GA and make it
exceptionally slow.
In most cases the fitness function and the objective function are the same as the objective is to
either maximize or minimize the given objective function. However, for more complex
problems with multiple objectives and constraints, an Algorithm Designer might choose to have
a different fitness function.
A fitness function should possess the following characteristics:
• The fitness function should be sufficiently fast to compute.
• It must quantitatively measure how fit a given solution is or how fit the individuals
produced from the given solution can be.
In some cases, calculating the fitness function directly might not be possible due to the inherent
complexities of the problem at hand. In such cases, we do fitness approximation to suit our
needs.
The following image shows the fitness calculation for a solution of the 0/1 Knapsack. It is a
simple fitness function which just sums the profit values of the items being picked (which have
a 1), scanning the elements from left to right till the knapsack is full.
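The knapsack fitness just described can be sketched as follows. The profit, weight and capacity
values are made up for the example, and stopping at the first picked item that no longer fits
is one possible reading of "till the knapsack is full".

```python
def knapsack_fitness(chromosome, profits, weights, capacity):
    """Sum the profits of picked items, scanning left to right until the knapsack is full."""
    total_profit = 0
    total_weight = 0
    for picked, profit, weight in zip(chromosome, profits, weights):
        if not picked:
            continue
        if total_weight + weight > capacity:
            break                      # assumed rule: stop at the first item that does not fit
        total_weight += weight
        total_profit += profit
    return total_profit


# Hypothetical instance with four items and capacity 9.
profits = [10, 5, 8, 7]
weights = [4, 3, 5, 2]
fitness = knapsack_fitness([1, 0, 1, 1], profits, weights, capacity=9)
```

Here items 0 and 2 fill the knapsack exactly, so item 3 is not counted even though its bit is 1.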
Reproduction / Selection
Fitness Proportionate Selection is one of the most popular ways of parent selection. In this
scheme, every individual can become a parent with a probability proportional to its fitness.
Fitter individuals therefore have a higher chance of mating and propagating their features to
the next generation. Such a selection strategy applies selection pressure towards the
fitter individuals in the population, evolving better individuals over time.
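Fitness proportionate selection is often implemented as a "roulette wheel". The sketch below is
one such implementation under the assumption that all fitness values are non-negative; the
function name and the optional rng parameter are hypothetical.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0 or min(fitnesses) < 0:
        raise ValueError("needs non-negative fitness values with a positive sum")
    pick = rng.random() * total        # a point on the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit                 # each individual owns a slice of width `fit`
        if pick < running:
            return individual
    return population[-1]              # guard against floating point round-off


# "B" has three times the fitness of "A", so it should win about 75% of the draws.
rng = random.Random(1)
draws = [roulette_wheel_select(["A", "B"], [1.0, 3.0], rng) for _ in range(4000)]
share_of_b = draws.count("B") / len(draws)
```

The check on negative values makes explicit why this scheme cannot be used directly when fitness
can be negative.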
Following are some of the selection strategies:
Tournament Selection
In K-Way tournament selection, we select K individuals from the population at random and
select the best out of these to become a parent. The same process is repeated for selecting the
next parent. Tournament Selection is also extremely popular in literature as it can even work
with negative fitness values.
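A minimal Python sketch of K-way tournament selection is given below. The function name and
the default K = 3 are assumptions; only comparisons between fitness values are needed, which is
why negative fitness causes no difficulty.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Draw k distinct individuals at random and return the fittest of them."""
    contenders = rng.sample(population, k)
    return max(contenders, key=fitness)


# Individuals are plain numbers here and the fitness is the number itself (all negative).
population = [-5, -2, -9, -1]
parent = tournament_select(population, fitness=lambda x: x, k=2, rng=random.Random(0))
```

A larger K increases the selection pressure, since weak individuals are less likely to win a
bigger tournament.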
Rank Selection
Rank Selection also works with negative fitness values and is mostly used when the individuals
in the population have very close fitness values (this usually happens at the end of the run).
Under fitness proportionate selection, close fitness values give each individual an almost
equal share of the pie, as shown in the following image, and hence each individual, no matter
how fit relative to the others, has approximately the same probability of being selected as a
parent. This in turn leads to a loss of selection pressure towards fitter individuals, causing
the GA to make poor parent selections in such situations.
In rank selection, we remove the concept of a fitness value while selecting a parent. Instead,
every individual in the population is ranked according to its fitness. The selection of the
parents depends on the rank of each individual and not on the fitness. Higher ranked individuals
are preferred over lower ranked ones.
Chromosome    Fitness Value    Rank
A             8.1              1
B             8.0              4
C             8.05             2
D             7.95             6
E             8.02             3
F             7.99             5
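The ranking in the table can be reproduced with a few lines of Python. The mapping from rank to
selection probability is not specified in the notes; the linear weighting used here
(weight = n - rank + 1) is an assumption, as are the function names.

```python
import random


def rank_population(fitnesses):
    """Return {name: rank}, with rank 1 for the highest fitness."""
    ordered = sorted(fitnesses, key=fitnesses.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def rank_probabilities(ranks):
    """Assumed linear ranking: weight = n - rank + 1, normalised to sum to 1."""
    n = len(ranks)
    total = n * (n + 1) / 2
    return {name: (n - rank + 1) / total for name, rank in ranks.items()}


def rank_select(fitnesses, rng=random):
    """Select one parent using rank-based probabilities instead of raw fitness."""
    probs = rank_probabilities(rank_population(fitnesses))
    names = list(probs)
    weights = [probs[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


table = {"A": 8.1, "B": 8.0, "C": 8.05, "D": 7.95, "E": 8.02, "F": 7.99}
ranks = rank_population(table)
probs = rank_probabilities(ranks)
```

With raw fitness values this close together, each chromosome would get a nearly equal share of
a roulette wheel, whereas the rank-based weights give A six times the chance of D.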
Crossover Operator
Crossover is the process of taking two parent solutions and producing from them a child. After
the selection (reproduction) process, the population is enriched with better individuals.
Reproduction makes clones of good strings but does not create new ones. The crossover operator
is applied to the mating pool in the hope that it creates better offspring.
Crossover is a recombination operator that proceeds in three steps:
a. The reproduction operator selects at random a pair of two individual strings for the
mating.
b. A cross site is selected at random along the string length.
c. Finally, the position values are swapped between the two strings following the cross
site.
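Steps b and c above (choosing a cross site and swapping the tails) can be sketched in Python as
follows. The function name is hypothetical, and the parents are assumed to have been picked
already by the selection step.

```python
import random


def one_point_crossover(parent1, parent2, rng=random):
    """Swap the parts of two equal-length strings that follow a random cross site."""
    if len(parent1) != len(parent2) or len(parent1) < 2:
        raise ValueError("parents must have the same length, at least 2")
    site = rng.randint(1, len(parent1) - 1)    # cross site strictly inside the string
    child1 = parent1[:site] + parent2[site:]
    child2 = parent2[:site] + parent1[site:]
    return child1, child2


child1, child2 = one_point_crossover("00000000", "11111111", rng=random.Random(0))
```

Using all-zero and all-one parents makes the cross site visible in the children: each child is a
run of one symbol followed by a run of the other.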
Uniform Crossover
In a uniform crossover, we don't divide the chromosome into segments; rather, we treat each
gene separately. We essentially flip a coin for each gene to decide whether or
not it will be included in the offspring. We can also bias the coin towards one parent, to have
more genetic material in the child from that parent.
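A minimal Python sketch of uniform crossover with a biased coin is shown below. The parameter
name bias (the probability of taking a gene from the first parent) is an assumption made for
the example.

```python
import random


def uniform_crossover(parent1, parent2, bias=0.5, rng=random):
    """Build a child gene by gene; each gene comes from parent1 with probability `bias`."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    return [g1 if rng.random() < bias else g2
            for g1, g2 in zip(parent1, parent2)]


mother = [0, 0, 0, 0, 0, 0, 0, 0]
father = [1, 1, 1, 1, 1, 1, 1, 1]
child = uniform_crossover(mother, father, bias=0.7, rng=random.Random(0))
```

With bias=0.5 both parents contribute equally on average; values closer to 1 keep more genes
from the first parent.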
Mutation Operator
After a crossover is performed, mutation takes place. This is to prevent all solutions in the
population from falling into a local optimum of the problem being solved. Mutation randomly
changes the new offspring. For binary encoding we can switch a few randomly chosen bits from 1
to 0 or from 0 to 1. Mutation can then look as follows:
Original offspring 1    1101111000011110
Original offspring 2    1101100100110110
Mutated offspring 1     1100111000011110
Mutated offspring 2     1101101100110110
The mutation depends on the encoding as well as the crossover. For example, when we are
encoding permutations, mutation could consist of exchanging two genes.
Bitwise operator
With the bitwise operator, we select one or more random bits and flip them. This is used for
binary encoded GAs.
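Both kinds of mutation mentioned in this section can be sketched in a few lines of Python: bit
flipping for binary chromosomes and exchanging two genes for permutations. The function names
and the per-bit probability parameter pm are assumptions for illustration.

```python
import random


def bit_flip_mutation(bits, pm, rng=random):
    """Flip each bit independently with probability pm."""
    return [1 - b if rng.random() < pm else b for b in bits]


def swap_mutation(perm, rng=random):
    """Exchange two randomly chosen genes; the result is still a permutation."""
    i, j = rng.sample(range(len(perm)), 2)
    mutated = list(perm)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


offspring = [1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
mutated_bits = bit_flip_mutation(offspring, pm=0.05, rng=random.Random(0))
mutated_tour = swap_mutation([0, 1, 2, 3, 4], rng=random.Random(0))
```

Flipping bits in a permutation chromosome would generally produce an invalid tour, which is why
the operator has to match the encoding.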
Generation Cycle:
A Genetic Algorithm operates through a simple cycle of stages:
i) Creation of a population of strings,
ii) Evaluation of each string,
iii) Selection of best strings and
iv) Genetic manipulation to create new population of strings.
Each cycle in a Genetic Algorithm produces a new generation of possible solutions for a
given problem. In the first phase, an initial population, describing representatives of the
potential solution, is created to initiate the search process. The elements of the population
are encoded into bit-strings, called chromosomes. The performance of the strings, often called
fitness, is then evaluated with the help of some functions representing the constraints of the
problem. Depending on their fitness, chromosomes are selected for a subsequent genetic
manipulation process.
It should be noted that the selection process is mainly responsible for assuring survival of the
best-fit individuals. After selection of the population strings is over, the genetic manipulation
process, consisting of two steps, is carried out. In the first step, the crossover operation,
which recombines the bits (genes) of each pair of selected strings (chromosomes), is executed.
In the second step, mutation alters bits at randomly chosen positions of the resulting strings.
Convergence of GA
A genetic algorithm is usually said to converge when there is no significant improvement in the
fitness values of the population from one generation to the next. Examples of stopping
criteria are time limits placed on the GA run, generation limits, or the algorithm finding an
individual whose fitness is lower than a specified threshold (in case we are minimizing
fitness). There is no defined difference between stopping criteria and convergence
criteria; the terms can be used interchangeably. Verifying that a GA has converged to a global
optimum for an NP-hard problem is generally not possible, unless you have a test data set for
which the best solution is already known. The best you can do is try out multiple runs of the GA
with different values of the mutation and crossover probabilities, and try out different fitness
functions, crossover operators, and variants of the simple GA such as elitism, multi-objective
GAs, etc.
The common GA terminating conditions are:
• A fixed number of generations is reached
• An optimal solution is obtained that satisfies the optimization criteria
• Successive GA iterations no longer produce better results
• The allocated budget (computational time / cost) is reached
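The four terminating conditions listed above can be combined into a single check, sketched here
in Python. The function name, the parameter names and all default values are assumptions, and
the sketch assumes fitness is being maximised.

```python
import time


def should_stop(generation, best_history, max_generations=200,
                target_fitness=None, stall_generations=25,
                tolerance=1e-9, deadline=None):
    """Return True if any terminating condition holds.

    best_history holds the best fitness found in each generation so far.
    deadline, if given, is a time.monotonic() value marking the end of the budget.
    """
    # 1. fixed number of generations reached
    if generation >= max_generations:
        return True
    # 2. a solution satisfying the optimization criteria has been found
    if target_fitness is not None and best_history and best_history[-1] >= target_fitness:
        return True
    # 3. successive iterations no longer produce better results
    if len(best_history) > stall_generations:
        improvement = best_history[-1] - best_history[-1 - stall_generations]
        if improvement <= tolerance:
            return True
    # 4. allocated time budget used up
    if deadline is not None and time.monotonic() >= deadline:
        return True
    return False


still_improving = should_stop(3, [1.0, 2.0, 3.0])
stalled = should_stop(30, [1.0] * 30, stall_generations=25)
```

A check of this kind would typically be evaluated once per generation as the condition of the
main GA loop.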
Application of GA
Genetic Algorithms are primarily used in optimization problems of various kinds, but they are
frequently used in other application areas as well.
Following are some of the areas in which Genetic Algorithms are frequently used:
• Optimization − Genetic Algorithms are most commonly used in optimization problems
wherein we have to maximize or minimize a given objective function value under a
given set of constraints. The approach to solve optimization problems has been
highlighted throughout the tutorial.
• Economics − GAs are also used to characterize various economic models like the
cobweb model, game theory equilibrium resolution, asset pricing, etc.
• Neural Networks − GAs are also used to train neural networks, particularly recurrent
neural networks.
• Parallelization − GAs also have very good parallel capabilities, prove to be a very
effective means of solving certain problems, and also provide a good area for research.
• Image Processing − GAs are used for various digital image processing (DIP) tasks as well,
like dense pixel matching.
• Vehicle routing problems − With multiple soft time windows, multiple depots and a
heterogeneous fleet.
• Scheduling applications − GAs are used to solve various scheduling problems as well,
particularly the time tabling problem.
• Machine Learning − As already discussed, genetics based machine learning (GBML) is a
niche area in machine learning.
• Robot Trajectory Generation − GAs have been used to plan the path which a robot arm
takes when moving from one point to another.
• Parametric Design of Aircraft − GAs have been used to design aircraft by varying the
parameters and evolving better solutions.
• DNA Analysis − GAs have been used to determine the structure of DNA using
spectrometric data about the sample.
• Multimodal Optimization − GAs are very good approaches for multimodal
optimization, in which we have to find multiple optimum solutions.
• Traveling salesman problem and its applications − GAs have been used to solve the
TSP, which is a well-known combinatorial problem, using novel crossover and packing
strategies.
Advances in GA: Implementation of GA using MATLAB
MATLAB has a wide variety of functions useful to the genetic algorithm practitioner. Given the
versatility of MATLAB's high-level language, problems can be coded in m-files in a fraction of the
time that it would take to create C or FORTRAN programs for the same purpose. Couple this
with MATLAB's advanced data analysis, visualization tools and special purpose application
domain toolboxes and the user is presented with a uniform environment with which to explore
the potential of genetic algorithms.
The Genetic Algorithm Toolbox uses MATLAB matrix functions to build a set of versatile tools for
implementing a wide range of genetic algorithm methods. The Genetic Algorithm Toolbox is a
collection of routines, written mostly in m-files, which implement the most important functions
in genetic algorithms.
The main data structures used by MATLAB in the Genetic Algorithm toolbox are:
• Chromosomes
• Objective function values
• Fitness values