Fundamentals of Neural Networks
Dr.(Mrs.)Lini Mathew
Associate Professor
Electrical Engineering Department
Objective
To emulate or simulate the human brain.
The Neuron
The fundamental building block of the nervous system
Performs all the computational and communication functions within the brain
A many-inputs / one-output unit
The Neuron
Consists of three sections:
cell body
dendrites
axon
Cell Body
manufactures a wide variety of complex molecules to keep the neuron renewed over its lifetime
manages the energy economy of the neuron
its outer membrane generates the nerve impulses
is 5 to 100 microns in diameter
Dendrites
Neurotransmitters, which are specialized chemicals, are released by the axon into the synaptic cleft and diffuse across to the dendrite.
Some neurotransmitters are excitatory and tend to produce an output pulse.
Others are inhibitory and tend to suppress such a pulse.
More than thirty neurotransmitters are known.
The Axon
may be as short as 0.1 mm or as long as 1 m
has multiple branches, each terminating in a synapse
Axons are wrapped in Schwann cells, forming an insulating sheath known as myelin.
The myelin sheath is interrupted every millimetre or so at narrow gaps called the nodes of Ranvier.
The Axon
Nerve impulses passing down the axon jump from node to node, thus saving energy.
Comparison of a conventional computer with the human brain (computer | brain):
Processor: Complex, High Speed, One or Few | Simple, Low Speed, A Large Number
Memory: Separate from a Processor, Localized | Integrated into Processor, Distributed
Computing: Centralized, Sequential, Stored Programs | Distributed, Parallel, Self-Learning
Reliability: Very Vulnerable | Robust
Expertise: Numerical and Symbolic Manipulations | Perceptual Problems
Operating Environment: Well-Defined, Well Constrained | Poorly Defined, Unconstrained
Activation Functions
Threshold Function (hard-limit transfer function)
S = hardlim(X): S = 1 if X > 0, S = 0 if X ≤ 0
Signum Function (symmetric hard-limit transfer function)
S = hardlims(X): S = +1 if X > 0, S = -1 if X ≤ 0
Activation Functions
Squashing Function or Logistic Function or Sigmoidal Function
S = logsig(X) = 1 / (1 + e^(-aX)), where a is the sigmoidal gain
S = 0.5 at X = 0; S approaches 1 for large positive X and 0 for large negative X
Activation Functions
Hyperbolic Tangent Function (tan-sigmoid transfer function)
S = tansig(X) = tanh(X)
S = 0 at X = 0; S approaches +1 for large positive X and -1 for large negative X
Linear Transfer Function
S = purelin(X) = X
Symmetric Saturating Linear Transfer Function
S = satlins(X): S = -1 if X < -1, S = X if -1 ≤ X ≤ +1, S = +1 if X > +1
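A minimal Python/NumPy sketch of the transfer functions above (not part of the original slides); the MATLAB toolbox names hardlim, hardlims, logsig, tansig, purelin and satlins are reused only as illustrative Python function names:

```python
import numpy as np

def hardlim(x):                  # threshold: 1 if x > 0 else 0
    return np.where(x > 0, 1.0, 0.0)

def hardlims(x):                 # signum: +1 if x > 0 else -1
    return np.where(x > 0, 1.0, -1.0)

def logsig(x, a=1.0):            # sigmoid with gain a: 1 / (1 + e^(-a*x))
    return 1.0 / (1.0 + np.exp(-a * x))

def tansig(x):                   # hyperbolic tangent
    return np.tanh(x)

def purelin(x):                  # linear
    return x

def satlins(x):                  # symmetric saturating linear, clipped to [-1, +1]
    return np.clip(x, -1.0, 1.0)

X = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (hardlim, hardlims, logsig, tansig, purelin, satlins):
    print(f.__name__, f(X))
```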
Training
Training is accomplished by sequentially applying input vectors while adjusting the network weights according to a predetermined procedure.
Supervised Training
requires the pairing of each input vector with a target vector
representing the desired output.
Unsupervised Training
requires no target vector for the output and no comparisons to
predetermined ideal responses. The training algorithm modifies
network weights to produce output vectors that are consistent. Also
called self-organizing networks.
Reinforcement Training
No target vector is given. The network is presented only with an indication of whether its output is right or wrong, and uses this signal to improve its performance.
Supervised Learning (Error Based)
  Error Correction / Gradient Descent (e.g. Back Propagation)
  Stochastic
Unsupervised Learning
  Hebbian
  Competitive
Learning Rules
A neural network learns about its environment through
an interactive process of adjustments applied to its
synaptic weights and bias levels.
The set of well-defined rules for the solution of a learning problem is called a learning algorithm.
Hebbian Learning Rule: the oldest and most famous of all learning rules, proposed by Donald Hebb in 1949.
Represents purely feed-forward, unsupervised learning.
If the cross product of output and input is positive, the weight increases; otherwise the weight decreases.
The weights are adjusted as Wij(k+1) = Wij(k) + xi y
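A minimal sketch of the Hebbian update above for a single output neuron with bipolar inputs; the rule on the slide has no explicit learning-rate factor, so the lr argument below is an illustrative addition:

```python
import numpy as np

def hebbian_update(W, x, y, lr=1.0):
    """Hebbian rule from the slide: Wij(k+1) = Wij(k) + xi * y
    (the learning-rate factor lr is added here only for illustration)."""
    return W + lr * np.outer(x, y)

W = np.zeros((3, 1))             # 3 inputs, 1 output neuron
x = np.array([1.0, -1.0, 1.0])   # one bipolar input pattern
y = np.array([1.0])              # the output it produces
W = hebbian_update(W, x, y)
print(W.ravel())                 # weights move toward the input pattern
```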
Perceptron
Supervised Learning Algorithm - Weights are
adjusted to minimize error whenever the
computed output does not match the target
output.
Netj = Σi xi Wij
yj = f(Netj), i.e. yj = 1 if Netj > 0, yj = 0 otherwise
Weight Adjustment:
(i) if the output is 1 but should have been 0, then Wij(k+1) = Wij(k) - xi
(ii) if the output is 0 but should have been 1, then Wij(k+1) = Wij(k) + xi
(weights are left unchanged when the output matches the target)
Successful only for linearly separable problems
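A minimal sketch of this perceptron rule on the AND problem, which is linearly separable; as assumptions for illustration, the bias is handled as an extra weight with a constant input of 1 and the weights are left unchanged when the output already matches the target:

```python
import numpy as np

def train_perceptron(X, t, epochs=10):
    """Perceptron rule from the slide: subtract x when the output is 1 but the
    target is 0, add x when the output is 0 but the target is 1."""
    X = np.hstack([X, np.ones((len(X), 1))])       # append bias input of 1
    W = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = 1 if x @ W > 0 else 0
            if y == 1 and target == 0:
                W -= x
            elif y == 0 and target == 1:
                W += x
    return W

# AND gate: linearly separable, so the perceptron converges
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1])
W = train_perceptron(X, t)
print([1 if np.append(x, 1.0) @ W > 0 else 0 for x in X])   # -> [0, 0, 0, 1]
```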
Linear Separability
Netj = Σi xi wi + b = x1 w1 + x2 w2 + b (for two inputs)
The relation Σi xi wi + b = 0 gives the decision boundary of the net input.
The equation denoting this decision boundary can
represent a line or plane.
On training, if the weights of training input vectors of
correct response +1 lie on one side of the boundary and
that of -1 lie on the other side of the boundary, then the
problem is linearly separable.
The decision boundary x1 w1 + x2 w2 + b = 0 is the straight line
x2 = -(w1/w2) x1 - b/w2
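For instance, with arbitrarily chosen weights w1 = w2 = 1 and b = -1.5, the net input is positive only for points above the line x2 = -x1 + 1.5:

```python
# Illustrative weights only: the boundary x1*w1 + x2*w2 + b = 0 separates
# the point (1, 1) from the other three binary points.
w1, w2, b = 1.0, 1.0, -1.5

def net(x1, x2):
    return x1 * w1 + x2 * w2 + b

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, '+1 side' if net(*p) > 0 else '-1 side')   # only (1, 1) is on the +1 side
```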
Linear Separability
[Figure: "Vectors to be Classified" plots showing sample points in the P(1)-P(2) plane on either side of the decision boundary.]
ADALINE Network
Adaptive Linear Neural Element Network
Output values are bipolar (-1 or 1)
Inputs could be binary, bipolar or real valued
The bias input is taken as +1
Learning algorithm (Delta Rule):
yj = 1 if Netj > 0, yj = -1 otherwise
Weight Adjustment: Wij(k+1) = Wij(k) + α (t - y) xi, where α is the learning rate and t the target output
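A minimal sketch of the delta rule above on the bipolar AND problem; as is usual for the Widrow-Hoff form, the error term below uses the linear net input rather than the thresholded output, and the learning rate 0.1 and epoch count are illustrative choices:

```python
import numpy as np

def adaline_train(X, t, lr=0.1, epochs=50):
    """Delta rule from the slide: Wij(k+1) = Wij(k) + lr*(t - y)*xi,
    with y taken as the linear net input during learning."""
    X = np.hstack([X, np.ones((len(X), 1))])   # bias input fixed at +1
    W = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, t):
            y = x @ W                          # linear output used for learning
            W += lr * (target - y) * x
    return W

def adaline_output(W, x):
    return 1 if np.append(x, 1.0) @ W > 0 else -1   # bipolar output

# bipolar AND problem
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
t = np.array([-1, -1, -1, 1], dtype=float)
W = adaline_train(X, t)
print([adaline_output(W, x) for x in X])       # -> [-1, -1, -1, 1]
```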
MADALINE Network
Developed by Bernard Widrow
Multiple ADALINE Network
Combining a number of ADALINE Networks
spread across multiple layers with adjustable
weights
The use of multiple ADALINEs helps counter the problem of non-linear separability.
Back Propagation Network
[Figure: three-layer network with input-layer outputs Oi1..Oi3, hidden-layer inputs/outputs Ih1..Ih3 and Oh1..Oh3, and output-layer inputs/outputs Io1..Io3 and Oo1..Oo3; V denotes the input-to-hidden weights and W the hidden-to-output weights.]
Hidden layer output: Oh = 1 / (1 + e^(-λ(Ih - θh))), where λ is the sigmoidal gain and θh the threshold of the hidden layer
Output layer output: Oo = 1 / (1 + e^(-λ(Io - θo))), with Io = W Oh
For the squared error E = ½ (T - Oo)², the chain rule gives
∂E/∂W = (∂E/∂Oo)(∂Oo/∂Io)(∂Io/∂W) = -(T - Oo) Oo (1 - Oo) Oh
∂E/∂V = -(T - Oo) Oo (1 - Oo) W Oh (1 - Oh) Ii
Weight changes (learning rate η, momentum α):
ΔW(i+1) = -η ∂E/∂W + α ΔW(i)
ΔV(i+1) = -η ∂E/∂V + α ΔV(i)
Weight updates:
W(i+1) = W(i) + ΔW(i+1)
V(i+1) = V(i) + ΔV(i+1)
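A minimal sketch of the forward and backward passes described by these equations, for a 2-4-1 network with sigmoid units trained on XOR (a problem that is not linearly separable); as simplifying assumptions, the gain λ and the thresholds are folded into the weights via bias inputs, momentum is omitted, and the learning rate 0.5 and iteration count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR: needs a hidden layer because it is not linearly separable
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
Xb = np.hstack([X, np.ones((4, 1))])          # append bias input of 1

V = rng.normal(0.0, 1.0, (3, 4))              # input-to-hidden weights (incl. bias row)
W = rng.normal(0.0, 1.0, (5, 1))              # hidden-to-output weights (incl. bias row)
eta = 0.5                                     # learning rate (illustrative value)

for _ in range(10000):
    # forward pass
    Oh = sigmoid(Xb @ V)                      # hidden layer outputs
    Ohb = np.hstack([Oh, np.ones((4, 1))])    # append hidden bias
    Oo = sigmoid(Ohb @ W)                     # output layer outputs
    # backward pass: gradients of the squared error, as in the equations above
    d_o = (T - Oo) * Oo * (1 - Oo)
    d_h = (d_o @ W[:4].T) * Oh * (1 - Oh)
    W += eta * Ohb.T @ d_o                    # delta-W proportional to -dE/dW
    V += eta * Xb.T @ d_h                     # delta-V proportional to -dE/dV

print(np.round(Oo.ravel(), 2))                # typically close to [0, 1, 1, 0]
```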
Associative Memory
Developed by John Hopfield
A single-layer feed-forward or recurrent network that makes use of the Hebbian or gradient-descent learning rule
A storehouse of associated patterns
If the associated pattern pairs (x, y) are different, it is a heteroassociative memory.
If x and y refer to the same pattern, it is an autoassociative memory.
Heterocorrelators and Autocorrelators
Autocorrelators
Hopfield Associative Memory
The connection matrix is indicative of the association of the pattern with itself:
T = Σ(i=1 to m) [Ai]^T [Ai]
Autocorrelator recall equation:
aj(new) = f( Σi ai(old) tij , aj(old) )
Two-parameter bipolar threshold function:
f(α, β) = +1 if α > 0; β if α = 0; -1 if α < 0
Hamming distance of vector X from Y:
HD(x, y) = Σ(i=1 to n) |xi - yi|
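A minimal sketch of the autocorrelator above: two bipolar patterns are stored in T = Σ Ai^T Ai and a noisy probe is recalled with the two-parameter threshold f (the patterns and the probe are illustrative):

```python
import numpy as np

def store(patterns):
    """Connection matrix T = sum of the outer products Ai^T Ai."""
    A = np.array(patterns, dtype=float)
    return A.T @ A

def f(alpha, beta):
    """Two-parameter bipolar threshold from the slide."""
    return 1.0 if alpha > 0 else (-1.0 if alpha < 0 else beta)

def recall(T, a, steps=5):
    a = np.array(a, dtype=float)
    for _ in range(steps):
        a = np.array([f(a @ T[:, j], a[j]) for j in range(len(a))])
    return a

patterns = [[1, 1, 1, -1, -1, -1], [1, -1, 1, -1, 1, -1]]
T = store(patterns)
print(recall(T, [1, 1, 1, -1, -1, 1]))   # noisy probe settles to the first pattern
```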
Heterocorrelators
Developed by Bart Kosko
Bidirectional Associative Memory, with the ability to recall stored pattern pairs
Two-layer recurrent networks
There are N training pairs {(A1, B1), (A2, B2), ..., (AN, BN)}, where
Ai = (ai1, ai2, ai3, ..., ain)
Bi = (bi1, bi2, bi3, ..., bin)
Correlation Matrix: M = Σ(i=1 to N) [Ai]^T [Bi]
Heterocorrelators
Recall equations:
B(new) = φ(A M)
A(new) = φ(B M^T)
where φ(F) = G = (g1, g2, g3, ..., gn) for F = (f1, f2, f3, ..., fn), with
gi = +1 if fi > 0; gi keeps its previous value if fi = 0; gi = -1 if fi < 0
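A minimal sketch of the BAM recall above: the correlation matrix is built from two illustrative bipolar pairs, and recall alternates between the A and B layers until the pair stabilizes:

```python
import numpy as np

def bam_matrix(pairs):
    """Correlation matrix M = sum of [Ai]^T [Bi] for bipolar pairs (Ai, Bi)."""
    return sum(np.outer(a, b) for a, b in pairs)

def phi(f, prev):
    """Bipolar threshold: +1 if f > 0, -1 if f < 0, keep previous value if f = 0."""
    return np.where(f > 0, 1.0, np.where(f < 0, -1.0, prev))

def bam_recall(M, a, steps=5):
    a = np.array(a, dtype=float)
    b = np.zeros(M.shape[1])
    for _ in range(steps):
        b = phi(a @ M, b)        # A layer -> B layer
        a = phi(b @ M.T, a)      # B layer -> A layer
    return a, b

pairs = [([1, 1, 1, -1, -1, -1], [1, 1, -1]),
         ([1, -1, 1, -1, 1, -1], [-1, 1, 1])]
M = bam_matrix(pairs)
a, b = bam_recall(M, [1, 1, 1, -1, -1, 1])   # noisy version of the first A pattern
print(a, b)   # settles to A1 = [1,1,1,-1,-1,-1] and its stored pair B1 = [1,1,-1]
```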
Character Recognition
Competitive Network
Clustering Technique
Vector Quantization is a method of dynamic
allocation of cluster centers
THANK YOU