Learning Law in Neural Networks

At a glance
The key takeaways are that neural networks have the ability to learn from their environment and improve performance through learning. Learning implies that the processing element changes its input/output behavior in response to the environment.

A perceptron is a type of linear classifier that classifies inputs into binary outputs (0 or 1) based on a weighted sum of the inputs and a bias. It was one of the earliest neural network models developed by Frank Rosenblatt in 1957.

The main difference between the perceptron and Adaline is that in a perceptron the weighted sum (net) is passed through an activation function and the function's output is used to adjust the weights, while in Adaline the weights are adjusted directly based on the net. Adaline was developed by Widrow and Hoff in 1960 and consists of weights, a bias, and a summation function.

Learning in a neural network is the ability of the network to learn from its environment and to improve its performance through learning.

Learning implies that the processing element somehow changes its input/output behavior in response to the environment. For example, if the processing element originally gives an output of +1 in response to a particular input pattern, it might have an output of -1 to that same input pattern after learning takes place. The processing element has somehow changed its mind about what the correct response to that input should be.

What does the processing element do to make this change? The output is computed as a result of a transfer function of the weighted input. The net input for this simple case is computed by multiplying the value of each individual input by its corresponding weight and summing the products, or equivalently, taking the dot product of the input and weight vectors. The processing element then takes this net input value and applies the transfer function to it to compute the resulting output.

activation function - A function by which the new output of the basic unit is derived from a combination of the net inputs and the current state of the unit (the total input).
axon - The part of a nerve cell through which impulses travel away from the cell body; the electrically active part of a nerve cell.
back-propagation - A learning algorithm for a multilayer network in which the weights are modified via the propagation of an error signal "backward" from the outputs to the inputs.
connection - A pathway between processing elements, either positive or negative, that links the processing elements into a network.
dendrite - The branched part of a nerve cell that carries impulses toward the cell body; the electrically passive part of a nerve cell.
learning - The phase in a neural network when new data is introduced into the network, causing the weights on the processing elements to be adjusted.
neuron - The structural and functional unit of the nervous system, consisting of the nerve cell body and all its processes, including an axon and one or more dendrites.
perceptron - A large class of simple neuron-like networks with only an input layer and an output layer. Developed in 1957 by Frank Rosenblatt, this class of neural network had no hidden layer.
summation function - A function that combines the various input activations into a single activation.
synapse - The point of contact between adjacent neurons where nerve impulses are transmitted from one to another.
threshold - A minimum level of excitation energy.
training - A process whereby a network learns to associate an input pattern with the correct answer.
weight - The strength of an input connection expressed by a real number.

Processing elements receive input via interconnects. Each interconnect has a weight attached to it. The weighted sum of the inputs makes up the value that updates the processing element. The output value of a processing element is described by a level of excitation that causes interconnects to be either on (i.e. excitatory output) or off (i.e. inhibitory output).
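The net input and transfer function described above can be sketched in a few lines of Python. This is only an illustration: the hard-limiting transfer function, the function names, and the weight values are assumptions chosen to reproduce the +1 to -1 change in response mentioned earlier, not values taken from the original notes.

```python
def net_input(x, w):
    """Weighted sum of the inputs: the dot product of input and weight vectors."""
    return sum(xi * wi for xi, wi in zip(x, w))


def sign_transfer(net):
    """Hard-limiting transfer function returning +1 or -1 (an assumed choice)."""
    return 1 if net >= 0 else -1


def output(x, w):
    """Output of one processing element for input x and weights w."""
    return sign_transfer(net_input(x, w))


x = [1.0, -1.0, 1.0]            # one fixed input pattern
w_before = [0.5, 0.2, 0.1]      # hypothetical weights before learning
w_after = [-0.5, 0.2, -0.1]     # hypothetical weights after learning

y_before = output(x, w_before)  # net = 0.5 - 0.2 + 0.1 = 0.4, so output is +1
y_after = output(x, w_after)    # net = -0.5 - 0.2 - 0.1 = -0.8, so output is -1
```

The input pattern is the same in both calls; only the weights differ, which is what "learning" changes in this picture.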

Architecture of neural networks


Feed-forward networks
Feed-forward ANNs (figure 1) allow signals to travel one way only: from input to output. There is no feedback (loops), i.e. the output of any layer does not affect that same layer. Feed-forward ANNs tend to be straightforward networks that associate inputs with outputs. They are extensively used in pattern recognition. This type of organisation is also referred to as bottom-up or top-down.

Feedback networks
Feedback networks (figure 1) can have signals travelling in both directions by introducing loops in the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' changes continuously until they reach an equilibrium point. They remain at the equilibrium point until the input changes and a new equilibrium needs to be found. Feedback architectures are also referred to as interactive or recurrent, although the latter term is often used to denote feedback connections in single-layer organisations.

Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data. Models and algorithms based on the principle of competitive learning include vector quantization and self-organising maps (Kohonen maps).

Competitive learning is usually implemented with neural networks that contain a hidden layer commonly known as the competitive layer [2]. Every competitive neuron $i$ is described by a vector of weights $\mathbf{w}_i = (w_{i1}, \dots, w_{id})^T$, $i = 1, \dots, M$, and calculates the similarity measure between the input data $\mathbf{x}^n = (x_{n1}, \dots, x_{nd})^T \in \mathbb{R}^d$ and the weight vector $\mathbf{w}_i$.

For every input vector, the competitive neurons compete with each other to see which one of them is the most similar to that particular input vector. The winner neuron $m$ sets its output $o_m = 1$ and all the other competitive neurons set their output $o_i = 0$, $i = 1, \dots, M$, $i \neq m$.

Usually, in order to measure similarity, the inverse of the Euclidean distance $\lVert \mathbf{x}^n - \mathbf{w}_i \rVert$ between the input vector $\mathbf{x}^n$ and the weight vector $\mathbf{w}_i$ is used.

Example
Here is a simple competitive learning algorithm to find three clusters within some input data.
1. (Set-up.) Let a set of sensors all feed into three different nodes, so that every node is connected to every sensor. Let the weights that each node gives to its sensors be set randomly between 0.0 and 1.0. Let the output of each node be the sum of all its sensors, each sensor's signal strength being multiplied by its weight.
2. When the net is shown an input, the node with the highest output is deemed the winner. The input is classified as being within the cluster corresponding to that node.
3. The winner updates each of its weights, moving weight from the connections that gave it weaker signals to the connections that gave it stronger signals.
Thus, as more data are received, each node converges on the centre of the cluster that it has come to represent and activates more strongly for inputs in this cluster and more weakly for inputs in other clusters.
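The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the definitive algorithm: the weights are rescaled so that each node's weights sum to 1, and the winner's update `w += lr * (x / sum(x) - w)` is one common way of "moving weight" from weaker to stronger connections while keeping that total fixed. The function names, the learning rate, and the binary sensor patterns are all hypothetical.

```python
import random


def make_nodes(n_nodes, n_sensors, rng):
    """Step 1: random weights in [0, 1), rescaled so each node's weights sum to 1."""
    nodes = []
    for _ in range(n_nodes):
        w = [rng.random() for _ in range(n_sensors)]
        total = sum(w)
        nodes.append([wi / total for wi in w])
    return nodes


def node_output(w, x):
    """Output of a node: sum of sensor signals, each multiplied by its weight."""
    return sum(wi * xi for wi, xi in zip(w, x))


def winner(nodes, x):
    """Step 2: index of the node with the highest output for input x."""
    outputs = [node_output(w, x) for w in nodes]
    return outputs.index(max(outputs))


def train_step(nodes, x, lr=0.1):
    """Step 3: only the winner shifts weight toward its active sensors."""
    m = winner(nodes, x)
    total = sum(x)
    if total > 0:
        nodes[m] = [wi + lr * (xi / total - wi) for wi, xi in zip(nodes[m], x)]
    return m


rng = random.Random(0)
nodes = make_nodes(3, 6, rng)
patterns = [              # hypothetical binary sensor readings
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
]
for _ in range(200):
    train_step(nodes, rng.choice(patterns))
```

With this update the total weight of each node is conserved, so a gain on the active connections is paid for by the inactive ones. Whether each node ends up owning exactly one cluster depends on the initial weights; this sketch does nothing to prevent nodes that never win.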

LEARNING TYPES
Supervised or active learning - learning with an external teacher or supervisor who presents a training set to the network.
Unsupervised or self-organized learning - learning that does not require an external teacher. During the training session, the neural network receives a number of different input patterns, discovers significant features in these patterns and learns how to classify input data into appropriate categories. Unsupervised learning can be used in real time.
Reinforcement learning - learning guided by feedback on the output. If the output matches the desired result, the feedback is +1; otherwise it is 0 or -1.

Unsupervised Learning Types
Adaptive Resonance Theory (ART) is a theory developed by Stephen Grossberg and Gail Carpenter on aspects of how the brain processes information. ART networks consist of an input layer and an output layer. ART networks perform completely unsupervised learning.

Carpenter and Grossberg (1987):
- On-line clustering algorithm
- Recurrent ANN
- Competitive output layer
- Data clustering applications
- Stability-plasticity dilemma

Stability-plasticity dilemma:
- Stability: system behaviour doesn't change after irrelevant events
- Plasticity: system adapts its behaviour according to significant events
- Dilemma: how to achieve stability without rigidity and plasticity without chaos?
- Ongoing learning capability
- Preservation of learned knowledge

ART Architecture

- Bottom-up weights b_ij
- Top-down weights t_ij: store class template
- Input nodes: vigilance test, input normalisation
- Output nodes: forward matching
- Long-term memory: ANN weights
- Short-term memory: ANN activation pattern

ART Types
- ART1: Unsupervised clustering of binary input vectors.
- ART2: Unsupervised clustering of real-valued input vectors.
- ART3: Incorporates "chemical transmitters" to control the search process in a hierarchical ART structure.
- ARTMAP: Supervised version of ART that can learn arbitrary mappings of binary patterns.
- Fuzzy ART: Synthesis of ART and fuzzy logic.
- Fuzzy ARTMAP: Supervised fuzzy ART.
- dART and dARTMAP: Distributed code representations in the F2 layer (extension of the winner-take-all approach).
- Gaussian ARTMAP

ART 1 - Binary Adaptive Resonance Theory


ART 1 is the simplest variety of ART networks, accepting only binary inputs. The basic structure of an ART1 neural network involves:
- an input processing field (called the F1 layer), which consists of two parts:
  - an input portion (F1(a))
  - an interface portion (F1(b))
- the cluster units (the F2 layer)
- a mechanism to control the degree of similarity of patterns placed on the same cluster (a reset mechanism)
- weighted bottom-up connections between the F1 and F2 layers
- weighted top-down connections between the F2 and F1 layers

Reset module
- Fixed connection weights
- Implements the vigilance test
- Excitatory connection from F1(b)
- Inhibitory connection from F1(a)
- Output of reset module is inhibitory to the output layer
- Disables the firing output node if the match with the pattern is not close enough
- The reset signal lasts for as long as the pattern is present
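The notes do not spell out the vigilance test itself. A minimal sketch, assuming the formulation commonly given for ART1, is shown below: the fraction of the input's active bits that also appear in the cluster template is compared with a vigilance parameter. The names `vigilance_test`, `template` and `rho` are hypothetical and do not come from the original text.

```python
def vigilance_test(x, template, rho):
    """Return True if binary input x matches the template closely enough.

    Assumed criterion: |x AND template| / |x| >= rho, where |.| counts ones.
    A False result corresponds to the reset module disabling the firing node.
    """
    active = sum(x)
    if active == 0:
        return True  # assumed convention for an all-zero input
    overlap = sum(xi & ti for xi, ti in zip(x, template))
    return overlap / active >= rho


x = [1, 1, 0, 1]         # binary input pattern (three active bits)
template = [1, 1, 0, 0]  # hypothetical class template of the firing node

passes_low = vigilance_test(x, template, rho=0.6)   # 2/3 >= 0.6, node is accepted
passes_high = vigilance_test(x, template, rho=0.9)  # 2/3 < 0.9, node is reset
```

A higher vigilance value demands a closer match, so the same input is more likely to be rejected by existing clusters.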

Gain module
- Fixed connection weights
- Controls the activation cycle of the input layer
- Excitatory connection from input lines
- Inhibitory connection from output layer
- Output of gain module is excitatory to the input layer
- 2/3 rule for the input layer

ART1 Example : character recognition

ART 2 - Analog Adaptive Resonance Theory


ART 2 extends network capabilities to support continuous inputs.
- Unsupervised clustering for:
  - Real-valued input vectors
  - Binary input vectors that are noisy
- Includes a combination of normalization and noise suppression

Fast learning
- Weights reach equilibrium in each learning trial
- Weights have some of the same characteristics as the weights found by ART1
- More appropriate for data in which the primary information is contained in the pattern of components that are small or large

Slow learning
- Only one weight update iteration is performed on each learning trial
- Needs more epochs than fast learning
- More appropriate for data in which the relative size of the nonzero components is important

ART 2-A - Analog Adaptive Resonance Theory


ART 2-A is a streamlined form of ART-2 with a drastically accelerated runtime, and with qualitative results being only rarely inferior to the full ART-2 implementation.

Applications of ART
- Natural language processing
  - Document clustering
  - Document retrieval
  - Automatic query
- Image segmentation
- Character recognition
- Data mining
  - Data set partitioning
  - Detection of emerging clusters
- Fuzzy partitioning
- Condition-action association

Hebbian learning
In 1949, Donald Hebb proposed one of the key ideas in biological learning, commonly known as Hebb's Law. Hebb's Law states that if neuron i is near enough to excite neuron j and repeatedly participates in its activation, the synaptic connection between these two neurons is strengthened and neuron j becomes more sensitive to stimuli from neuron i.

Hebb's Law can be represented in the form of two rules:
1. If two neurons on either side of a connection are activated synchronously, then the weight of that connection is increased.
2. If two neurons on either side of a connection are activated asynchronously, then the weight of that connection is decreased.

Hebb's Law provides the basis for learning without a teacher. Learning here is a local phenomenon occurring without feedback from the environment.
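The two rules can be illustrated with the plain Hebbian update, in which the weight change is proportional to the product of the two activations. This is a sketch under assumptions: activations are taken to be bipolar (+1 for active, -1 for inactive), so that synchronous activity gives a positive product and asynchronous activity a negative one. The learning rate `lr` and the function names are hypothetical.

```python
def hebbian_update(w, x, y, lr=0.1):
    """Plain Hebb rule for one connection: new weight is w + lr * x * y."""
    return w + lr * x * y


def hebbian_update_layer(weights, x, y, lr=0.1):
    """Apply the same rule to every connection; weights[i][j] links x[i] to y[j]."""
    return [
        [weights[i][j] + lr * x[i] * y[j] for j in range(len(y))]
        for i in range(len(x))
    ]


w = 0.0
w_sync = hebbian_update(w, x=1, y=1)    # rule 1: activated together, weight increases
w_async = hebbian_update(w, x=1, y=-1)  # rule 2: activated out of step, weight decreases

layer = hebbian_update_layer([[0.0, 0.0], [0.0, 0.0]], x=[1, -1], y=[1, -1])
```

Each update uses only the activity at the two ends of the connection, which is what makes the rule local and teacher-free.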


Competitive learning
In competitive learning, neurons compete among themselves to be activated. While in Hebbian learning several output neurons can be activated simultaneously, in competitive learning only a single output neuron is active at any time. The output neuron that wins the competition is called the winner-takes-all neuron. The basic idea of competitive learning was introduced in the early 1970s. In the early 1980s, Teuvo Kohonen introduced a special class of artificial neural networks called self-organising feature maps. These maps are based on competitive learning.

SOM
What is a self-organising feature map? Our brain is dominated by the cerebral cortex, a very complex structure of billions of neurons and hundreds of billions of synapses. The cortex includes areas that are responsible for different human activities (motor, visual, auditory, somatosensory, etc.), and associated with different sensory inputs. We can say that each sensory input is mapped into a corresponding area of the cerebral cortex. The cortex is a self-organising computational map in the human brain.

The Kohonen network
The Kohonen model provides a topological mapping. It places a fixed number of input patterns from the input layer into a higher-dimensional output or Kohonen layer. Training in the Kohonen network begins with the winner's neighbourhood of a fairly large size. Then, as training proceeds, the neighbourhood size gradually decreases.
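The shrinking neighbourhood can be sketched with a small one-dimensional Kohonen layer. Several choices here are assumptions made only for illustration: the winner is picked by smallest Euclidean distance to the input, the neighbourhood radius decreases linearly to zero, and every neuron inside the neighbourhood moves toward the input by the same fixed learning rate. All function and parameter names are hypothetical.

```python
import random


def best_matching_unit(weights, x):
    """Index of the output neuron whose weight vector is closest to x."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def neighbourhood_radius(step, n_steps, start_radius):
    """Radius shrinks linearly from start_radius toward 0 (assumed schedule)."""
    return int(round(start_radius * (1 - step / n_steps)))


def train_som(data, n_neurons=5, n_steps=500, lr=0.2, start_radius=2, seed=0):
    """Train a 1-D Kohonen layer; returns one weight vector per output neuron."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for step in range(n_steps):
        x = rng.choice(data)
        win = best_matching_unit(weights, x)
        radius = neighbourhood_radius(step, n_steps, start_radius)
        # Update the winner and its neighbours on the 1-D grid.
        for j in range(max(0, win - radius), min(n_neurons, win + radius + 1)):
            weights[j] = [wj + lr * (xi - wj) for wj, xi in zip(weights[j], x)]
    return weights


data = [[0.1, 0.1], [0.9, 0.9], [0.1, 0.9], [0.9, 0.1]]  # illustrative inputs
som_weights = train_som(data)
```

Early in training a win drags several neighbouring neurons along, which orders the map; late in training only the winner itself is adjusted.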

[Figure: Kohonen network architecture, showing the input layer (x1, x2) and the output layer]

The lateral connections are used to create a competition between neurons. The neuron with the largest activation level among all neurons in the output layer becomes the winner. This neuron is the only neuron that produces an output signal. The activity of all other neurons is suppressed in the competition. The lateral feedback connections produce excitatory or inhibitory effects, depending on the distance from the winning neuron. This is achieved by the use of a Mexican hat function which describes synaptic weights between neurons in the Kohonen layer.
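The text names the Mexican hat function without giving a formula. One commonly used form, assumed here purely for illustration, is the Ricker wavelet: strongly positive (excitatory) near the winning neuron, negative (inhibitory) at intermediate distances, and close to zero far away. The parameter `sigma`, which sets the width of the excitatory region, is hypothetical.

```python
import math


def mexican_hat(distance, sigma=1.0):
    """Lateral connection strength at a given distance from the winning neuron.

    Assumed form (Ricker wavelet): (1 - r^2) * exp(-r^2 / 2), with r = distance / sigma.
    """
    r2 = (distance / sigma) ** 2
    return (1.0 - r2) * math.exp(-r2 / 2.0)


at_winner = mexican_hat(0.0)  # 1.0: strongest excitation at the winner itself
nearby = mexican_hat(0.5)     # positive: close neighbours are excited
further = mexican_hat(2.0)    # negative: more distant neurons are inhibited
far_away = mexican_hat(10.0)  # approximately 0: remote neurons are barely affected
```

Under this form the sign change happens at a distance of exactly `sigma` from the winner.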

Supervised Learning Types


Hopfield Network


PERCEPTRON
In machine learning, the perceptron is an algorithm for supervised classification of an input into one of two possible classes. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector describing a given input. The learning algorithm for perceptrons is an online algorithm, in that it processes elements in the training set one at a time. The perceptron algorithm was invented in 1957 at the Cornell Aeronautical Laboratory by Frank Rosenblatt.

The perceptron is a binary classifier which maps its input $x$ (a real-valued vector) to an output value $f(x)$ (a single binary value):

$$f(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases}$$

where $w$ is a vector of real-valued weights, $w \cdot x$ is the dot product (which here computes a weighted sum), and $b$ is the 'bias', a constant term that does not depend on any input value. The value of $f(x)$ (0 or 1) is used to classify $x$ as either a positive or a negative instance, in the case of a binary classification problem. If $b$ is negative, then the weighted combination of inputs must produce a positive value greater than $|b|$ in order to push the classifier neuron over the 0 threshold. Spatially, the bias alters the position (though not the orientation) of the decision boundary.

The perceptron learning algorithm does not terminate if the learning set is not linearly separable: learning will never reach a point where all vectors are classified properly. The most famous example of the perceptron's inability to solve problems with linearly nonseparable vectors is the Boolean exclusive-or problem. The solution spaces of decision boundaries for all binary functions and learning behaviors are studied in the reference.

In the context of artificial neural networks, a perceptron is an artificial neuron using the Heaviside step function as the activation function. The perceptron algorithm is also termed the single-layer perceptron, to distinguish it from a multilayer perceptron, which is a misnomer for a more complicated neural network. As a linear classifier, the single-layer perceptron is the simplest feedforward neural network.
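The decision rule and the learning behaviour described above can be sketched as follows. This is a minimal illustration, not Rosenblatt's original implementation: it uses the usual online perceptron update (add the error times the input to the weights), and the learning rate, the epoch cap, and the function names are assumptions. The epoch cap exists only because, as noted, training would otherwise never stop on data that is not linearly separable.

```python
def predict(w, b, x):
    """Heaviside step on the weighted sum: 1 if w.x + b > 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def train_perceptron(samples, lr=1.0, max_epochs=100):
    """Online perceptron rule. Returns (w, b, converged)."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in samples:        # one training element at a time
            delta = target - predict(w, b, x)
            if delta != 0:               # update only on a misclassification
                w = [wi + lr * delta * xi for wi, xi in zip(w, x)]
                b += lr * delta
                errors += 1
        if errors == 0:                  # a full pass with no mistakes
            return w, b, True
    return w, b, False


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w_and, b_and, ok_and = train_perceptron(AND)  # linearly separable: converges
w_xor, b_xor, ok_xor = train_perceptron(XOR)  # not separable: hits the epoch cap
```

On AND the loop finishes with an error-free pass after a handful of epochs, while on exclusive-or no setting of `w` and `b` can classify all four points, so `ok_xor` stays `False` however large `max_epochs` is made.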

ADALINE (Adaptive Linear Neuron or later Adaptive Linear Element) is an early single-layer neural network and the name of the physical device that implemented this network [1]. It was developed by Professor Bernard Widrow and his graduate student Ted Hoff at Stanford University in 1960. It is based on the McCulloch-Pitts neuron. It consists of a weight, a bias and a summation function. The difference between Adaline and the standard (McCulloch-Pitts) perceptron is that in the learning phase the weights are adjusted according to the weighted sum of the inputs (the net). In the standard perceptron, the net is passed to the activation (transfer) function and the function's output is used for adjusting the weights. There also exists an extension known as Madaline.

Adaline is a single layer neural network with multiple nodes where each node accepts multiple inputs and generates one output. Given the following variables:
- $x$ is the input vector
- $w$ is the weight vector
- $n$ is the number of inputs
- $\theta$ is some constant
- $y$ is the output

we find that the output is $y = \sum_{j=1}^{n} x_j w_j + \theta$. If we further assume that $x_0 = 1$ and $w_0 = \theta$, then the output reduces to the dot product of $x$ and $w$: $y = \sum_{j=0}^{n} x_j w_j$.
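To make the contrast with the perceptron concrete, here is a minimal sketch of Adaline training in which the error is computed from the net (the plain dot product) rather than from a thresholded output. The update used is the least-mean-squares rule usually associated with Widrow and Hoff; the learning rate, the number of epochs, the function names, and the toy data are assumptions. The constant term is absorbed as an extra weight on a fixed input of 1, which is what lets the output be written as a single dot product.

```python
def net(w, x):
    """Linear output: the dot product of x and w, with x[0] = 1 carrying the constant."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_adaline(samples, lr=0.1, epochs=500):
    """Adjust weights from the error on the net, one sample at a time."""
    w = [0.0] * (len(samples[0][0]) + 1)  # one extra weight for the constant
    for _ in range(epochs):
        for x, target in samples:
            xa = (1.0,) + tuple(x)        # prepend the fixed input of 1
            error = target - net(w, xa)   # no activation function applied here
            w = [wi + lr * error * xi for wi, xi in zip(w, xa)]
    return w


# Illustrative targets generated by y = 1 + 2 * x.
samples = [((0.0,), 1.0), ((1.0,), 3.0), ((2.0,), 5.0)]
w = train_adaline(samples)  # expected to approach [1.0, 2.0]
```

Because the error is a real number rather than 0 or ±1, the weights keep being refined even when a thresholded version of the output would already be correct.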
