Unit 3 Deep Learning
Activation Function
In artificial neural networks, each neuron forms a weighted sum of its
inputs and passes the resulting scalar value through a function referred to
as an activation function.
Let’s consider the simple neural network model without any hidden layers.
Here is the output:
Y = ∑ (wi * xi) + bias
So, if the inputs are x1, x2, x3, ..., xn and the weights are w1, w2, w3, ..., wn,
this output can range from -infinity to +infinity. It is therefore necessary to
bound the output to get the desired prediction or generalized results.
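As a quick illustration, here is a minimal Python sketch (using NumPy, with made-up example values) of how a single neuron computes this weighted sum:

import numpy as np

# Example inputs and weights (illustrative values only)
x = np.array([0.5, -1.2, 3.0])    # inputs x1, x2, x3
w = np.array([0.8, 0.1, -0.4])    # weights w1, w2, w3
bias = 0.2

# Weighted sum: Y = sum(wi * xi) + bias
Y = np.dot(w, x) + bias
print(Y)    # unbounded value; an activation function is used to bound it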
The activation function compares the input value to a threshold value. If the
input value is greater than the threshold value, the neuron is activated. If the
input value is less than the threshold value, the neuron is deactivated and its
output is not passed on to the next (hidden) layer.
Mathematically, the binary step activation function can be represented as:
f(x) = 1 if x >= threshold, and f(x) = 0 if x < threshold.
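For concreteness, a minimal Python sketch of this binary step function (assuming a threshold of 0) might look like:

def binary_step(x, threshold=0.0):
    # Neuron fires (outputs 1) only when the input reaches the threshold
    return 1 if x >= threshold else 0

print(binary_step(2.5))   # 1 (activated)
print(binary_step(-0.7))  # 0 (deactivated)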
These activation functions are mainly divided on the basis of their ranges and curves.
The remainder of this article will outline the major non-linear activation
functions used in neural networks.
The activation that almost always works better than the sigmoid function is the
tanh function, also known as the hyperbolic tangent function. It is actually a
mathematically scaled and shifted version of the sigmoid function. The two are
similar and can be derived from each other.
Equation: - tanh(x) = (e^x - e^-x) / (e^x + e^-x)
Value Range: - -1 to +1
Nature: - non-linear
Uses: - Usually used in the hidden layers of a neural network, as its values lie
between -1 and 1, so the mean of the hidden layer's outputs comes out to be 0 or
very close to it. This helps in centring the data by bringing the mean close to 0,
which makes learning for the next layer much easier.
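A minimal Python sketch (using NumPy, with illustrative inputs) showing that tanh outputs stay in (-1, 1) and tend to be centred near 0:

import numpy as np

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])   # example pre-activations
y = np.tanh(x)                              # tanh(x) = (e^x - e^-x) / (e^x + e^-x)
print(y)          # values lie strictly between -1 and 1
print(y.mean())   # mean is close to 0 for inputs centred around 0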
As shown in the figure, ReLU is half-rectified (from the bottom): f(z) is zero when
z is less than zero, and f(z) is equal to z when z is greater than or equal to zero.
Now how does ReLU transform its input? It uses this simple formula:
f(x)=max(0,x)
The ReLU function and its derivative are both monotonic. The function returns 0 if it
receives any negative input, but for any positive value x it returns that value
back. Thus, it gives an output that ranges from 0 to infinity.
Now let us give some inputs to the ReLU activation function and see how it
transforms them and then we will plot them also.
def ReLU(x):
    # Return the input unchanged if it is positive, otherwise return 0
    if x > 0:
        return x
    else:
        return 0
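Continuing the example, here is a short sketch (assuming NumPy and matplotlib are available) that feeds some sample inputs through this ReLU and plots the result:

import numpy as np
import matplotlib.pyplot as plt

inputs = np.arange(-10, 11)              # sample inputs from -10 to 10
outputs = [ReLU(x) for x in inputs]      # negative inputs map to 0
print(list(zip(inputs, outputs)))

plt.plot(inputs, outputs)                # half-rectified shape of ReLU
plt.xlabel("input")
plt.ylabel("ReLU(input)")
plt.show()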
Range: [0, infinity)
But the issue is that all negative values become zero immediately, which
decreases the ability of the model to fit or train on the data properly.
That is, any negative input given to the ReLU activation function is turned
into zero immediately, which in turn affects the resulting graph by not
mapping the negative values appropriately.
ReLU stands for Rectified Linear Unit. It is the most widely used activation
function, chiefly implemented in the hidden layers of neural networks.
Equation :- A(x) = max(0,x). It gives an output x if x is positive and 0
otherwise.
Value Range :- [0, inf)
Nature :- non-linear, which means we can easily backpropagate the
errors and have multiple layers of neurons being activated by the ReLU
function.
Uses :- ReLU is less computationally expensive than tanh and sigmoid
because it involves simpler mathematical operations. At any given time only
a few neurons are activated, making the network sparse and therefore
efficient and easy to compute.
In simple words, ReLU learns much faster than the sigmoid and tanh
functions.
What is L1 regularization?
L1 regularization, also known as Lasso regularization, is a machine-
learning strategy that inhibits overfitting by introducing a penalty term
into the model's loss function based on the absolute values of the model's
parameters.
L1 regularization seeks to reduce some model parameters toward zero in
order to lower the number of non-zero parameters in the model.
L1 regularization is particularly useful when working with high-
dimensional data since it enables one to choose a subset of the most
important attributes.
This lessens the risk of overfitting and makes the model easier to
understand. The size of the penalty term is controlled by the
hyperparameter lambda, which regulates the strength of the L1
regularization.
As lambda rises, more parameters are driven to zero and the
regularization becomes stronger.
L1 regularization, also called Lasso regression (Least Absolute Shrinkage
and Selection Operator), adds the “absolute value of magnitude” of each
coefficient as a penalty term to the loss function.
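As a minimal sketch (assuming a simple linear model, with illustrative data and NumPy), the L1-regularized loss, i.e. the mean squared error plus lambda times the sum of absolute weights, can be computed like this:

import numpy as np

def l1_loss(X, y, w, lam):
    # Mean squared error plus lambda times the sum of absolute weights
    preds = X @ w
    mse = np.mean((y - preds) ** 2)
    return mse + lam * np.sum(np.abs(w))

# Illustrative values only
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.25])
print(l1_loss(X, y, w, lam=0.1))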
What is L2 regularization?
L2 regularization, also known as Ridge regularization, is a machine
learning technique that avoids overfitting by introducing a penalty term
into the model's loss function based on the squares of the model's
parameters.
The goal of L2 regularization is to keep the model's parameters small
and prevent them from becoming too large.
To achieve L2 regularization, a term that is proportional to the
squares of the model's parameters is added to the loss function.
This term acts as a constraint on the parameters' size, preventing them
from growing out of control.
The size of the penalty term is again controlled by the hyperparameter
lambda, which sets the regularization's intensity. The greater the lambda,
the smaller the parameters and the stronger the regularization.
Ridge regression adds the “squared magnitude” of each coefficient as a penalty
term to the loss function:
Loss = ∑ (yi - ŷi)^2 + lambda * ∑ wj^2
Here the penalty term lambda * ∑ wj^2 represents the L2
regularization element.
Here, if lambda is zero then we get back ordinary least squares.
However, if lambda is very large then it will add too much weight and
lead to under-fitting. That said, it is important how lambda is
chosen. This technique works very well to avoid the over-fitting issue.
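As with L1, here is a minimal sketch (illustrative data and parameter values, using NumPy) of the L2-regularized (ridge) loss:

import numpy as np

def l2_loss(X, y, w, lam):
    # Mean squared error plus lambda times the sum of squared weights
    preds = X @ w
    mse = np.mean((y - preds) ** 2)
    return mse + lam * np.sum(w ** 2)

# Illustrative values only
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.25])
print(l2_loss(X, y, w, lam=0.1))
print(l2_loss(X, y, w, lam=0.0))   # lambda = 0 recovers ordinary least squares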