Gradient Descent Learning: Minimize Objective Function: Error Landscape
Objective Function: Error Landscape

SSE (Sum Squared Error): $E = \sum_i (t_i - z_i)^2$

[Figure: error landscape, plotting the SSE over the weight values]
Minimizing the Error

[Figure: error surface over weight space. Starting from the initial error at $w_{initial}$, following the negative derivative produces a positive weight change, descending to the final error at a local minimum, $w_{trained}$.]
Gradient Descent Training Rule
• $\nabla E(\mathbf{w})$ = gradient of the error in weight space.
• $w_i := w_i + \delta w_i$, where $\delta w_i = -\eta \frac{\partial E}{\partial w_i}$, i.e. $\delta \mathbf{w} = -\eta \nabla E(\mathbf{w})$.
• With a sufficiently small learning rate $\eta$, this process converges towards a (local) minimum of the error.
• The gradient is summed over all training cases (batch).

Each update moves in weight space towards the point that minimizes the squared error on the training set.
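The batch training rule above can be sketched as a small update loop. This is a minimal sketch, assuming a generic differentiable objective; the function name `gradient_descent` and the default step count are illustrative, not from the slides:

```python
import numpy as np

def gradient_descent(w, grad_E, eta=0.1, steps=100):
    """Repeatedly apply the batch training rule
    w_i := w_i + delta_w_i, where delta_w = -eta * grad E(w)."""
    for _ in range(steps):
        w = w - eta * grad_E(w)   # move against the error gradient
    return w
```

With a sufficiently small `eta`, each step moves down the error surface towards a (local) minimum.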
Deriving a Gradient Descent Learning Algorithm
• Goal is to decrease the overall error (or other objective function) each time a weight is changed.
• Total Sum Squared Error is one possible objective function: $E = \sum_i (t_i - z_i)^2$
• Seek a weight-changing algorithm such that the error gradient is negative.
• If such a formula can be found, then we have a gradient descent learning algorithm.

$$\nabla E[\mathbf{w}] = \left[\frac{\partial E}{\partial w_0}, \frac{\partial E}{\partial w_1}, \ldots, \frac{\partial E}{\partial w_n}\right]$$

Training rule: $\Delta \mathbf{w} = -\eta \nabla E[\mathbf{w}]$, i.e. $\Delta w_i = -\eta \dfrac{\partial E}{\partial w_i}$
Linear Unit Gradient Descent Training Rule

Guaranteed to converge to the minimum squared error:
• given a sufficiently small learning rate $\eta$,
• even when the training data contains noise,
• even when the training data is not separable.

$$\Delta \mathbf{w} = -\eta \nabla E(\mathbf{w}), \qquad \Delta w_i = -\eta \frac{\partial E}{\partial w_i}$$

$$\frac{\partial E}{\partial w_i} = \frac{\partial}{\partial w_i} \frac{1}{2} \sum_{x \in D} \big(t(x) - o(x)\big)^2 = \frac{1}{2} \sum_{x \in D} 2\big(t(x) - o(x)\big) \frac{\partial}{\partial w_i}\big(t(x) - o(x)\big)$$

$$= \sum_{x \in D} \big(t(x) - o(x)\big) \frac{\partial}{\partial w_i}\big(t(x) - \mathbf{w} \cdot \mathbf{x}\big)$$

$$\frac{\partial E}{\partial w_i} = -\sum_{x \in D} \big(t(x) - o(x)\big)\, x_i$$
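The final expression can be implemented directly. A minimal sketch, assuming the training inputs in $D$ are the rows of a matrix `X` (function names are illustrative):

```python
import numpy as np

def sse_gradient(w, X, t):
    """dE/dw = -sum_{x in D} (t(x) - o(x)) * x  for a linear unit o(x) = w.x,
    with E = 1/2 * sum_{x in D} (t(x) - o(x))^2."""
    o = X @ w                 # linear outputs for all training inputs
    return -X.T @ (t - o)     # vector of partial derivatives dE/dw_i

def delta_rule_step(w, X, t, eta=0.01):
    """One batch weight update: delta_w = -eta * dE/dw."""
    return w - eta * sse_gradient(w, X, t)
```

A quick sanity check is to compare `sse_gradient` against a numerical finite-difference gradient and to verify the error decreases after one step with a small learning rate.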
Measuring Error for a Linear Output (Not Perceptron)

• Linear output function: $o(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x}$
• Error measure: $E(\mathbf{w}) = \dfrac{1}{2} \sum_{d \in D} (t_d - o_d)^2$, where $t_d$ is the target value and $o_d$ the linear unit output for example $d$ in the data set $D$.
What about the Perceptron?
• Recall that in a perceptron the output passes through a hard threshold, which is not differentiable, so this gradient cannot be computed directly.
Non-Linear activation functions
Important Relationship Between Sigmoid and tanh
(try the proofs)

$$\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{1 + \tanh(x/2)}{2}$$

$$\tanh(x) = \frac{2}{1 + e^{-2x}} - 1$$
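The two identities can be checked numerically before proving them; a quick sketch using Python's `math` module:

```python
import math

def sigmoid(x):
    """Logistic sigmoid: 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))

# sigma(x) = (1 + tanh(x/2)) / 2   and   tanh(x) = 2 / (1 + e^(-2x)) - 1
for x in [-3.0, -0.5, 0.0, 1.0, 2.5]:
    assert abs(sigmoid(x) - (1.0 + math.tanh(x / 2.0)) / 2.0) < 1e-12
    assert abs(math.tanh(x) - (2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0)) < 1e-12
```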
MSE Gradient for Non-Linear Units

$$\frac{\partial E}{\partial w_i} = \frac{\partial}{\partial w_i} \frac{1}{2} \sum_{d \in D} (t_d - o_d)^2 = \frac{1}{2} \sum_{d} 2(t_d - o_d) \frac{\partial}{\partial w_i}(t_d - o_d) = -\sum_{d} (t_d - o_d) \frac{\partial o_d}{\partial w_i}$$

By the chain rule,

$$\frac{\partial o_d}{\partial w_i} = \frac{\partial o_d}{\partial net_d} \cdot \frac{\partial net_d}{\partial w_i}$$

But we know:

$$o_d = \sigma(net_d) \;\Rightarrow\; \frac{\partial o_d}{\partial net_d} = o_d(1 - o_d), \qquad net_d = \mathbf{w} \cdot \mathbf{x}_d \;\Rightarrow\; \frac{\partial net_d}{\partial w_i} = x_{i,d}$$

So:

$$\frac{\partial E}{\partial w_i} = -\sum_{d \in D} (t_d - o_d)\, o_d (1 - o_d)\, x_{i,d}$$
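The final expression translates directly into code. A minimal sketch, assuming training inputs as rows of `X` (function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_unit_gradient(w, X, t):
    """dE/dw_i = -sum_d (t_d - o_d) * o_d * (1 - o_d) * x_{i,d},
    with o_d = sigmoid(w . x_d) and E = 1/2 * sum_d (t_d - o_d)^2."""
    o = sigmoid(X @ w)                       # outputs o_d for all examples
    return -X.T @ ((t - o) * o * (1 - o))    # chain rule through the sigmoid
```

As with the linear unit, the result can be sanity-checked against a finite-difference approximation of $\partial E / \partial w_i$.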
Sigmoid Function
• Continuous, differentiable, and easy to compute.
• [Figure: plot of the sigmoid and its derivative, $\sigma'(x) = \sigma(x)(1 - \sigma(x))$]
Practice Problem
• For the given problem, apply gradient descent learning to update the weights for one epoch (applying the weight updates to all the data in the training set).
• The solution is provided on the next slide.
• Check whether you can get the calculations on your own.
• Use a sigmoid function as the processing unit.
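Since the problem's actual training data appears on a slide not shown here, the sketch below uses hypothetical data; it shows the shape of one batch epoch for a sigmoid unit using the gradient derived above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_epoch(w, X, t, eta=0.5):
    """One batch epoch for a sigmoid unit: sum the weight updates over all
    training examples, then apply them once (delta_w = -eta * dE/dw)."""
    o = sigmoid(X @ w)
    delta_w = eta * X.T @ ((t - o) * o * (1 - o))
    return w + delta_w

# Hypothetical data (NOT the problem's data): two examples, two inputs each
X = np.array([[1.0, 0.0], [1.0, 1.0]])
t = np.array([0.2, 0.8])
w = one_epoch(np.zeros(2), X, t)
```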
CS 478 - Perceptrons