
Advanced Artificial Neural Networks
Dr. Tehseen Zia
What are Artificial Neural Networks?
• Computational models inspired by the human brain:
• Algorithms that try to mimic the brain.
• Massively parallel, distributed systems made up of simple processing units (neurons).
• Synaptic connection strengths among neurons are used to store the acquired knowledge.
• Knowledge is acquired by the network from its environment through a learning process.
Properties
• Inputs are flexible
• any real values
• typically take vectors
• Target function may be discrete-valued, real-valued, or a vector of discrete or real values
• Outputs are real numbers between 0 and 1
• Resistant to errors in the training data
• Long training time
• The function produced can be difficult for humans to interpret
When to consider neural networks
• Input is high-dimensional, discrete or real-valued
• Output is discrete or real-valued
• Output is a vector of values
• Possibly noisy data
• Form of target function is unknown
• Human readability of the result is not important
Examples:
• Image classification
• Language modeling
• Speech phoneme recognition
• Financial prediction
History
• Early Beginnings (1940s - 1950s):
• The concept of artificial neurons and neural networks was first introduced in the 1940s by Warren
McCulloch and Walter Pitts, who proposed a mathematical model of a simplified neuron.
• In 1958, Frank Rosenblatt developed the Perceptron, a single-layer neural network designed for
binary classification tasks.
• Limitations and the Perceptron Controversy (1960s):
• Despite initial excitement, the Perceptron had limitations and could only solve linearly separable
problems.
• A famous study by Marvin Minsky and Seymour Papert in 1969 highlighted the limitations of single-layer perceptrons, leading to a period of skepticism about neural networks.
• Gradient Based Learning (1980s):
• In the 1980s, researchers like David Rumelhart, Geoffrey Hinton, and James McClelland contributed to
the development of parallel distributed processing models, which laid the groundwork for modern
neural networks.
• They invented gradient based learning for training neural networks.
• They demonstrated the power of multi-layer neural networks and introduced the backpropagation
algorithm for training them.
• Convolutional Neural Networks (CNNs) (1980s - 1990s):
• Yann LeCun and others developed Convolutional Neural Networks (CNNs) in the late 1980s, particularly for
image recognition tasks.
History
• Recurrent Neural Networks (RNNs) (1980s - 1990s):
• RNNs, designed for sequence data, were developed during this period. They
found applications in natural language processing and speech recognition.
• AI Winter (1990s):
• Research in ANNs faced challenges and setbacks, leading to a period known
as the "AI winter" where funding and interest in artificial intelligence waned.
• Resurgence of Deep Learning (2000s - Present):
• The 2000s saw a resurgence of interest in ANNs, driven by more powerful
computing hardware, larger datasets, and advances in training algorithms.
• The term "deep learning" gained popularity in the 2010s, describing neural
networks with multiple hidden layers.
• Deep Learning Boom (2010s - Present):
• Deep learning achieved remarkable breakthroughs in computer vision, natural language processing, and speech recognition, leading to advances in applications like image recognition, machine translation, autonomous vehicles, and games.
Why Artificial Neural Networks?
Why ANN?
• Hand-engineered features are time-consuming, brittle, and not scalable in practice
• Can we learn the underlying features directly from data?
Why Now?
• Neural networks date back decades, so why the resurgence?
Timeline:
• 1952: Gradient descent
• 1958: Perceptron (learnable weights)
• 1986: Backpropagation (multi-layer perceptron)
• 1995: Convolutional neural network (digit recognition)
What changed:
• Big Data: large datasets, easier collection and storage
• Hardware: graphics processing units (GPUs), massive parallelization
• Software: improved techniques, new models, toolkits
The Perceptron
The structural building block of deep learning

Biological Neuron
[Figure: a biological neuron, the inspiration for the artificial perceptron]

The Perceptron
Inputs x1, x2, …, xm are multiplied by weights w1, w2, …, wm and summed to produce the output ŷ.

Linear combination of inputs:
Σ = Σ_{i=1}^{m} x_i w_i

Output:
ŷ = g(Σ), where g(Σ) = 1 if Σ ≥ 0, and −1 otherwise
The Perceptron: Example
Let X = [x1, x2] and W = [3, −2]. Then

Σ = 3·x1 − 2·x2

The decision boundary Σ = 0, i.e. the line 3·x1 − 2·x2 = 0 in the (x1, x2) plane, passes through the points (−2, −3), (0, 0), and (2, 3). Inputs on one side of this line give ŷ = 1; inputs on the other side give ŷ = −1.
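As a concrete sketch of the perceptron just described, the snippet below (ours, not from the slides) implements the weighted sum and the threshold activation g, and evaluates it with the example weights W = [3, −2] on a few made-up inputs.

```python
import numpy as np

def perceptron(x, w):
    """Perceptron with a threshold (step) activation:
    returns 1 if the weighted sum is >= 0, and -1 otherwise."""
    s = np.dot(x, w)              # linear combination: sum_i x_i * w_i
    return 1 if s >= 0 else -1

# Example weights from the slides: W = [3, -2], so the sum is 3*x1 - 2*x2
w = np.array([3.0, -2.0])

print(perceptron(np.array([2.0, 1.0]), w))   # 3*2 - 2*1 = 4  -> 1
print(perceptron(np.array([1.0, 3.0]), w))   # 3*1 - 2*3 = -3 -> -1
print(perceptron(np.array([2.0, 3.0]), w))   # 3*2 - 2*3 = 0  -> 1 (on the boundary)
```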
Implementing AND with a Perceptron
Logical AND: for binary inputs x1, x2 ∈ {0, 1}, the output should be 1 only for (1, 1), and 0 for (0, 0), (0, 1), and (1, 0).

With weights w1 = 1 and w2 = 1:
Σ = w1·x1 + w2·x2
The decision boundary x1 + x2 = 2 separates (1, 1) from the other three points, so the perceptron outputs 1 when Σ ≥ 2 and 0 otherwise.
Implementing OR with a Perceptron
Logical OR: for binary inputs x1, x2 ∈ {0, 1}, the output should be 1 for (0, 1), (1, 0), and (1, 1), and 0 only for (0, 0).

With weights w1 = 1 and w2 = 1:
Σ = w1·x1 + w2·x2
The decision boundary x1 + x2 = 1 separates (0, 0) from the other three points, so the perceptron outputs 1 when Σ ≥ 1 and 0 otherwise.
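Both gates follow directly from these weights and thresholds. The sketch below is a minimal illustration; the helper function and its name are ours, while the thresholds 2 and 1 are taken from the slides.

```python
def logic_perceptron(x1, x2, w1, w2, threshold):
    """Binary perceptron: output 1 if w1*x1 + w2*x2 >= threshold, else 0."""
    s = w1 * x1 + w2 * x2
    return 1 if s >= threshold else 0

def logical_and(x1, x2):
    # AND: weights 1, 1 and threshold 2 (only (1, 1) reaches the threshold)
    return logic_perceptron(x1, x2, w1=1, w2=1, threshold=2)

def logical_or(x1, x2):
    # OR: weights 1, 1 and threshold 1 (only (0, 0) stays below the threshold)
    return logic_perceptron(x1, x2, w1=1, w2=1, threshold=1)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND:", logical_and(a, b), "OR:", logical_or(a, b))
```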
Non-Linearly Separable Problems
What if we want to distinguish red versus green points?
Most real-world problems are not linearly separable.

Can Multiple Perceptrons Solve Non-Linearly Separable Problems?
A single perceptron cannot solve XOR, because XOR is not linearly separable.
Can Multiple Perceptrons Solve Non-Linearly Separable Problems?
Use two perceptrons, each defining its own linear boundary (Perceptron #1 and Perceptron #2), and combine them with a decision rule:

Decision rule:
if Σ of P1 < 0 -> black
else if Σ of P2 > 0 -> black
else -> white
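As a sketch of this idea, the snippet below combines two linear units with the decision rule above to compute XOR. The particular weights and bias terms are our own illustrative choices, not values given in the slides.

```python
import numpy as np

def linear_unit(x, w, b):
    """One perceptron's weighted sum (no activation): w . x + b."""
    return np.dot(w, x) + b

def xor_two_perceptrons(x1, x2):
    """Two linear units combined with the slide's decision rule.
    Weights and biases are illustrative choices:
      P1: x1 + x2 - 0.5  (negative only at (0, 0))
      P2: x1 + x2 - 1.5  (positive only at (1, 1))
    'black' corresponds to XOR = 0, 'white' to XOR = 1."""
    x = np.array([x1, x2], dtype=float)
    s1 = linear_unit(x, np.array([1.0, 1.0]), -0.5)
    s2 = linear_unit(x, np.array([1.0, 1.0]), -1.5)
    if s1 < 0:        # decision rule, step 1
        return 0      # black
    elif s2 > 0:      # decision rule, step 2
        return 0      # black
    else:
        return 1      # white

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_two_perceptrons(a, b))
```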
Multi-perceptron Architecture
The inputs x1 and x2 feed two perceptrons in a first layer: Perceptron #1 (weights w11, w21) and Perceptron #2 (weights w12, w22), each computing a weighted sum Σ. Their outputs are then combined by Perceptron #3 in a second layer, with its own pair of weights, to produce ŷ.
Multi-perceptron Mathematically
• Perceptron 1: z1 = w11·x1 + w21·x2
• Perceptron 2: z2 = w12·x1 + w22·x2
• Perceptron 3: ŷ = v1·z1 + v2·z2 (writing v1, v2 for the second-layer weights)

Substituting z1 and z2:
ŷ = v1·(w11·x1 + w21·x2) + v2·(w12·x1 + w22·x2)

This is still of the form a·x1 + b·x2, i.e. a linear function of the inputs.
A sum of linear functions is a linear function, so stacking perceptrons without non-linearities adds no expressive power.
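A quick numerical check of this point, with made-up weights: composing two linear layers is the same as applying a single linear layer whose weights are the product of the two.

```python
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(size=(2, 2))   # first-layer weights (illustrative values)
v  = rng.normal(size=(2,))     # second-layer weights

x = np.array([1.5, -0.7])      # an arbitrary input

# Two stacked linear perceptrons (no activation function)
z = W1 @ x                     # first layer: z1, z2
y_stacked = v @ z              # second layer combines z1 and z2

# The equivalent single linear map
w_equiv = v @ W1               # collapses both layers into one weight vector
y_single = w_equiv @ x

print(y_stacked, y_single)     # identical: stacking linear layers adds nothing
```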
Multi-perceptron Architecture with Non-linearity
To fix this, a non-linear activation function is applied after each summation: Perceptron #1 and Perceptron #2 output g(z1) and g(z2) rather than z1 and z2, and Perceptron #3 applies the non-linearity again to produce ŷ.
The Perceptron
Inputs, weights, sum, non-linearity, output:

Linear combination of inputs:
z = Σ_{i=1}^{m} x_i w_i

Output:
ŷ = g(z), where g is a non-linear activation function (plotted as g(z) against z).
Importance of Activation Functions
• A linear activation function produces a linear decision boundary, no matter the size of the network.
• Non-linearities allow us to approximate arbitrarily complex functions.
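For concreteness, here is a small sketch of three commonly used non-linear activation functions. The slides only require that g be non-linear, so these particular choices (sigmoid, tanh, ReLU) are illustrative.

```python
import numpy as np

def sigmoid(z):
    """Squashes z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes z into (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Zero for negative inputs, identity for positive inputs."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```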
Building Neural Networks with Perceptrons
Multi-output Perceptron
• Because all inputs are densely connected to all outputs, these layers are called Dense layers.

Inputs x1, x2, …, xm feed every output unit:
y1 = g(z1)
y2 = g(z2)
where each z_i is a weighted sum of all the inputs.
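A minimal NumPy sketch of such a dense layer; the layer sizes, the random weights, and the use of a sigmoid for g are our own illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense_layer(x, W):
    """Dense (fully connected) layer: every input feeds every output.
    x: input vector of shape (m,)
    W: weight matrix of shape (n_outputs, m)
    Returns g(z) for each output unit, with z = W @ x."""
    z = W @ x
    return sigmoid(z)

m, n_outputs = 3, 2
rng = np.random.default_rng(0)
W = rng.normal(size=(n_outputs, m))   # illustrative random weights
x = np.array([0.5, -1.0, 2.0])

print(dense_layer(x, W))              # y1 = g(z1), y2 = g(z2)
```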
Single Layer Neural Network
Input layer: x1, x2, …, xm
Hidden layer: units z1, z2, z3, z4 with activations g(z1), …, g(z4), connected to the inputs by weights W^(1)
Output layer: ŷ1, ŷ2, connected to the hidden layer by weights W^(2)

ŷ_i = g( Σ_{j=1}^{d1} g(z_j) · w^(2)_{j,i} )

where d1 is the number of hidden units.
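The forward pass of this single-hidden-layer network can be sketched as follows; the layer sizes, random weights, and sigmoid non-linearity are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def single_layer_network(x, W1, W2):
    """Forward pass of a network with one hidden layer.
    x:  input vector, shape (m,)
    W1: input-to-hidden weights, shape (d1, m)
    W2: hidden-to-output weights, shape (n_out, d1)
    Implements y_hat_i = g( sum_j g(z_j) * w2[i, j] )."""
    z = W1 @ x                 # hidden pre-activations z_1 ... z_d1
    h = sigmoid(z)             # hidden activations g(z_j)
    y_hat = sigmoid(W2 @ h)    # output activations
    return y_hat

rng = np.random.default_rng(0)
m, d1, n_out = 2, 4, 2
W1 = rng.normal(size=(d1, m))
W2 = rng.normal(size=(n_out, d1))

print(single_layer_network(np.array([4.0, 5.0]), W1, W2))
```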
Example Problem: Will I pass this class?
Let's start with a simple two-feature model:
x1 = number of lectures you attend
x2 = hours spent on the project

The training data are plotted in the (x1, x2) plane (Legend: Pass / Fail).
Question: will a new student with x = [4, 5] pass?
Single Layer Neural Network
Feeding the input x = [4, 5] through the network (hidden weights W^(1), output weights W^(2)):
Predicted = 0.1
Actual = 1
Quantifying Loss
The loss of our network measures the cost incurred from incorrect predictions.

ℒ( f(x^(i); W), y^(i) )
where f(x^(i); W) is the predicted value and y^(i) is the actual value.

For the example above, the prediction is 0.1 while the actual label is 1.
Empirical Loss
The empirical loss measures the total loss over the entire dataset.

x        f(x)   y
[4, 5]   0.1    1   ×
[2, 1]   0.8    0   ×
[5, 8]   0.6    1   √
⋮        ⋮      ⋮

J(W) = (1/n) Σ_{i=1}^{n} ℒ( f(x^(i); W), y^(i) )
Mean Squared Error Loss
The mean squared error can be used with regression models that output continuous real numbers.

x        f(x)   y
[4, 5]   40     87   ×
[2, 1]   85     65   ×
[5, 8]   97     95   √
⋮        ⋮      ⋮

J(W) = (1/n) Σ_{i=1}^{n} ( y^(i) − f(x^(i); W) )²
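A short sketch of computing the empirical loss J(W) with the squared-error per-example loss, reusing the numbers from the table above.

```python
import numpy as np

def mse_loss(predictions, targets):
    """Empirical loss with a squared-error per-example loss:
    J(W) = (1/n) * sum_i (y_i - f(x_i; W))**2."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.mean((targets - predictions) ** 2)

# Predicted vs. actual final grades from the table above
f_x = [40, 85, 97]
y   = [87, 65, 95]

print(mse_loss(f_x, y))   # average of (87-40)^2, (65-85)^2, (95-97)^2
```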
Training Neural Networks

Loss Optimization
We want to find the network weights that achieve the lowest loss:

W* = argmin_W (1/n) Σ_{i=1}^{n} ℒ( f(x^(i); W), y^(i) )
W* = argmin_W J(W)

where W = { W^(1), W^(2), … }
Loss Optimization
W* = argmin_W J(W)

Picture the loss J(w1, w2) as a surface over the weight space (w1, w2). To find the weights with the lowest loss:
1. Randomly pick an initial (w1, w2).
2. Compute the gradient ∂J(W)/∂W.
3. Take a step in the opposite direction of the gradient.
4. Repeat steps 2 and 3, moving downhill until reaching a minimum.
Gradient Descent
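A minimal gradient descent sketch following the steps listed above. The quadratic example loss and the learning rate are illustrative assumptions; in a real network, the gradient ∂J(W)/∂W would be computed by backpropagation.

```python
import numpy as np

def J(w):
    """Illustrative loss surface J(w1, w2) with its minimum at (3, -2)."""
    return (w[0] - 3.0) ** 2 + (w[1] + 2.0) ** 2

def grad_J(w):
    """Gradient dJ/dw of the illustrative loss above."""
    return np.array([2.0 * (w[0] - 3.0), 2.0 * (w[1] + 2.0)])

rng = np.random.default_rng(0)
w = rng.normal(size=2)          # 1. randomly pick an initial (w1, w2)
learning_rate = 0.1

for step in range(100):
    g = grad_J(w)               # 2. compute the gradient
    w = w - learning_rate * g   # 3. step opposite to the gradient
                                # 4. repeat

print(w, J(w))                  # w approaches (3, -2), where the loss is smallest
```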
The Perceptron: Forward Propagation
Inputs, weights, sum, non-linearity, output:

Linear combination of inputs:
z = Σ_{i=1}^{m} x_i w_i

Non-linear activation function:
ŷ = g(z)

In vector form, with X = [x1, …, xm] and W = [w1, …, wm]:
ŷ = g( X · W )
Importance of Activation Functions
• The purpose of an activation function is to introduce non-linearity into the network.
• What if we want to build a neural network to distinguish red versus green points?
The Perceptron: Example
Take weights W = [3, −2]. For an input X = [x1, x2]:

z = 3·x1 − 2·x2
ŷ = g(z)

The decision boundary z = 0, i.e. 3·x1 − 2·x2 = 0, is a line in 2D. In the (x1, x2) plane it passes through (−2, −3), (0, 0), and (2, 3).

A new input such as (−1, 2) gives z = 3·(−1) − 2·(2) = −7 < 0, so it falls on the negative side of the line.
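To close the loop on this example, here is a small sketch that evaluates the perceptron at the points shown on the plot; the choice of a sigmoid for g is an illustrative assumption, since the slides only require a non-linear g.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron_forward(x, w):
    """Forward propagation: y_hat = g(z) with z = sum_i x_i * w_i."""
    z = np.dot(x, w)
    return z, sigmoid(z)

w = np.array([3.0, -2.0])                      # weights from the example
for point in [(-2, -3), (0, 0), (2, 3), (-1, 2)]:
    z, y_hat = perceptron_forward(np.array(point, dtype=float), w)
    print(point, "z =", z, "y_hat =", round(y_hat, 3))
# The first three points lie on the line z = 0 (y_hat = 0.5);
# (-1, 2) gives z = -7, i.e. the negative side of the line.
```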
