Convolutional Neural Networks (Image Recognition) Part - II: Dr. Syed M. Usman
1
Content
• Regularization Techniques
– L2 and L1 regularization
– Dropout
– Data augmentation
– Early stopping
• Batch Normalization
• Case Study: AlexNet
• Using Pre-Trained Nets
– Transfer Learning
– Fine Tuning
2
Regularization Techniques
• Objective: Avoiding overfitting
3
Overfitting
4
Regularization
• Regularization is a technique which makes slight
modifications to the learning algorithm such that the
model generalizes better.
• This in turn improves the model’s performance on the
unseen data as well.
5
Intuitive Explanation: Regression
Regularization
• Loss function for a linear regression with 4 input variables:
6
Intuitive Explanation: Regression
Regularization
• When we penalize the weights θ3 and θ4 and make them very small,
very close to zero, those terms become negligible, which helps
simplify the model
7
Intuitive Explanation: Regression
Regularization
• We add the regularization term to the sum of squared
differences between the actual and predicted values (a sketch of the resulting cost follows this slide).
• The regularization term keeps the weights small, making the model
simpler and helping avoid overfitting
8
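As a sketch of the cost described above, assuming the standard squared-error loss over m training examples with an L2 penalty on the weights θ_j (this exact form is an assumption, since the original slide's formula image is not reproduced here):

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda \sum_{j=1}^{n}\theta_j^2\right]$$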
Intuitive Explanation: Regression
Regularization
• λ is the penalty term or regularization parameter, which determines
how much to penalize the weights.
• When λ is zero, the regularization term becomes zero and we are back
to the original loss function.
9
Example - NN
• Consider a neural network which is overfitting
on the training data as shown in the image
below.
10
Example - NN
• Regularization penalizes the weight matrices of the nodes.
• Assume that our regularization coefficient is so high that some of the
weight matrices are nearly equal to zero.
• This will result in a much simpler linear network and slight underfitting
of the training data
11
Example - NN
• We need to optimize the value of regularization coefficient in
order to obtain a well-fitted model
12
Regularization Techniques
• L2 and L1 regularization
• Dropout
• Data augmentation
• Early stopping
13
L1 and L2 Regularization
• L1 and L2 are the most common types of regularization. These
update the general cost function by adding another term known as
the regularization term.
– Cost function = Loss (say, binary cross entropy) + Regularization
term
• Due to the addition of this regularization term, the values of the weight
matrices decrease; this reflects the assumption that a neural network with
smaller weight matrices leads to simpler models.
• Therefore, it will also reduce overfitting to quite an extent.
14
L1 and L2 Regularization
• L2 Regularization
Cost function = Loss + (λ / 2m) × Σ ‖w‖²
• L1 Regularization
Cost function = Loss + (λ / 2m) × Σ ‖w‖
15
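A minimal Keras sketch of adding L1 or L2 penalties to a layer's weights via kernel_regularizer (tensorflow.keras is assumed; layer sizes and λ values are illustrative):

from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    # L2 penalty (weight decay) on this layer's kernel, lambda = 0.01
    layers.Dense(64, activation="relu", input_shape=(100,),
                 kernel_regularizer=regularizers.l2(0.01)),
    # L1 penalty encourages sparse weights
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l1(0.001)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

Each regularizer adds its penalty term to the loss during training, matching the cost-function form on this slide.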
L1 and L2 Regularization
16
Dropout
• Dropout is by far the most popular regularization
technique for deep neural networks.
• During training time, at each iteration, a neuron is
temporarily “dropped” or disabled with probability p
• This means all the inputs and outputs to this neuron will
be disabled at the current iteration.
• The dropped-out neurons are resampled with probability
p at every training step, so a dropped out neuron at one
step can be active at the next one.
• The hyperparameter p is called the dropout-rate and it’s
typically a number around 0.5
17
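A minimal Keras sketch of dropout between fully connected layers (tensorflow.keras is assumed; layer sizes are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(128, activation="relu", input_shape=(784,)),
    layers.Dropout(0.5),   # each unit is dropped with probability 0.5 during training
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
# Dropout is applied only during training; it is disabled automatically
# at inference time (model.predict / model.evaluate).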
Dropout
• Why does dropout work?
– Dropout prevents the network from becoming too dependent on a
small number of neurons, and forces every neuron to be able to
operate independently
– Practically: parameters linked to dropped neurons are not updated in
the iteration in which they were dropped
18
Dropout
• Let’s say you were the only person at your company who knows
about finance. If you were guaranteed to be at work every day, your
coworkers wouldn’t have an incentive to pick up finance skills.
• But if every morning you tossed a coin to decide whether you will
go to work or not, then your coworkers will need to adapt.
• Some days you might not be at work but finance tasks still need to
get done, so they can't rely only on you. Your coworkers will need to
learn about finance, and this expertise needs to be spread out
among various people.
• The workers need to cooperate with several other employees, not
with a fixed set of people.
• This makes the company much more resilient overall, increasing the
quality and skillset of the employees.
19
Dropout
20
Data Augmentation
• The simplest way to reduce overfitting is to increase the size
of the training data.
21
Data Augmentation Techniques
• When a computer takes an image as input, it takes in an array of
pixel values.
• Let's say that the whole image is shifted left by 1 pixel. To us, this
change is imperceptible. To a computer, however, the shift is
significant: the classification or label of the image doesn't change,
but the pixel array does.
• Approaches that alter the training data in ways that change the
array representation while keeping the label the same are known
as data augmentation techniques.
• They are a way to artificially expand the dataset.
22
Data Augmentation
23
Data Augmentation
24
Data Augmentation
• Horizontal Flips
25
Data Augmentation
• Random Crops/Scales
26
Data Augmentation
• Color Jitter: Randomly jitter contrast
27
Data Augmentation
• Random Combinations of:
– Translation
– Rotation
– Stretching
– Shearing
– Lens distortion
– …..
28
Data Augmentation
29
Data Augmentation
• In Keras, we can perform all of these transformations
using ImageDataGenerator.
• It has a long list of arguments that can be used to pre-process
the training data (a minimal sketch follows this slide).
30
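A minimal sketch of ImageDataGenerator with a few of its pre-processing arguments (the parameter values are illustrative assumptions):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Each argument enables one type of random transformation at training time.
datagen = ImageDataGenerator(
    rotation_range=20,        # random rotations up to 20 degrees
    width_shift_range=0.1,    # horizontal translation (fraction of width)
    height_shift_range=0.1,   # vertical translation (fraction of height)
    shear_range=0.1,          # shearing
    zoom_range=0.2,           # random zoom (crop/scale)
    horizontal_flip=True,     # horizontal flips
    fill_mode="nearest",      # how to fill pixels created by the transforms
)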
Data Augmentation
31
Data Augmentation
32
Data Augmentation
• Saving augmented copies of images to disk is not desirable
• Instead, generate the copies in memory and pass them directly to the learning algorithm (sketched below)
33
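A sketch of feeding augmented batches straight to training without writing copies to disk. It reuses the datagen defined in the sketch above; x_train, y_train, x_val, y_val and a compiled model are assumed to exist:

# Augmented images are produced on the fly, batch by batch; nothing is saved to disk.
datagen.fit(x_train)  # only needed for statistics-based options (e.g. featurewise centering)
model.fit(datagen.flow(x_train, y_train, batch_size=32),
          epochs=20,
          validation_data=(x_val, y_val))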
Early Stopping
• Early stopping is a kind of cross-validation strategy in which we keep
one part of the training set aside as a validation set.
• When we see that the performance on the validation set is getting
worse, we immediately stop training the model.
35
Early Stopping
patience denotes the number of epochs with no further improvement after which
the training will be stopped. After the dotted line, each epoch will result in a higher
value of validation error. Therefore, 5 epochs after the dotted line (since our
patience is equal to 5), our model will stop because no further improvement is seen.
36
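A minimal Keras sketch of early stopping with patience = 5, as in the description above (a compiled model and validation data are assumed):

from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor="val_loss",        # watch the validation error
    patience=5,                # stop after 5 epochs with no improvement
    restore_best_weights=True, # roll back to the weights of the best epoch seen
)
model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=100,
          callbacks=[early_stop])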
Content
• Regularization Techniques
– L2 and L1 regularization
– Dropout
– Data augmentation
– Early stopping
• Batch Normalization
• Case Study: AlexNet
• Using Pre-Trained Nets
– Transfer Learning
– Fine Tuning
37
Batch Normalization
• Batch normalization improved the top result of ImageNet
(2014) by a significant margin using only 7% of the training
steps
38
Covariate Shift
• A binary classifier for roses: The output is 1 if the image is that of a
rose and the output is 0 otherwise.
• Consider a subset of the training data that primarily has red
rose buds as rose examples and wildflowers as non-rose
examples.
39
Covariate Shift
• Consider another subset that has fully blown roses of different colors as
rose examples and other non-rose flowers in the picture as non-rose
examples.
40
Covariate Shift
The two subsets have very different distributions. The last column shows the distribution of
the two classes in the feature space using red and green dots. The blue line shows the decision
boundary between the two classes.
41
Covariate Shift
• The natural way to solve this problem for the input layer is to randomize
the data before creating mini-batches.
• How do we solve this for the hidden layers?
• Each hidden unit’s input distribution changes every time there is a
parameter update in the previous layer.
• Since the activations of a previous layer are the inputs of the next
layer, each layer in the neural network is faced with a situation
where the input distribution changes with each step.
• This is called internal covariate shift, and it slows training
• This problem is solved by normalizing the layer's inputs over a mini-batch;
the process is therefore called Batch Normalization.
42
Batch Normalization
• We normalize the input layer by adjusting and scaling the activations.
• For example, when some features range from 0 to 1 and others from 1 to
1000, we should normalize them to speed up learning.
• If the input layer benefits from this, why not do the same thing for the
values in the hidden layers, which change all the time, and get a 10x or
greater improvement in training speed?
• To increase the stability of a neural network, batch
normalization normalizes the output of a previous activation
layer by subtracting the batch mean and dividing by the batch
standard deviation.
43
Why Normalize?
44
Batch Normalization
• The basic idea behind batch normalization is to limit covariate shift
by normalizing the activations of each layer (transforming the
inputs to be mean 0 and unit variance).
• This, supposedly, allows each layer to learn on a more stable
distribution of inputs, and would thus accelerate the training of the
network.
• In practice, restricting the activations of each layer to be
strictly 0 mean and unit variance can limit the expressive
power of the network.
• Therefore, in practice, batch normalization allows the network
to learn parameters gamma and beta that can convert the
mean and variance to any value that the network desires.
45
Batch Normalization
• Normalizing the input values speeds up learning of the parameters.
• The same question arises in a deeper network.
[Diagram: network with hidden activations a[1], a[2], a[3] and parameters W[3], b[3]]
46
Batch Normalization
• Can we normalize a[2] so as to learn W[3], b[3] faster?
• For practical purposes, there is a debate on whether to normalize
a[2] or z[2] (i.e. before or after the activation); the latter is employed in
practice (Andrew Ng)
• Steps:
– Given some intermediate values in the net for some hidden layer (the normalization steps are sketched after this slide)
48
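A sketch of the standard normalization steps for the intermediate values z^(1), …, z^(m) of one hidden layer over a mini-batch (ε is a small constant for numerical stability; γ and β are learned parameters):

$$\mu = \frac{1}{m}\sum_{i} z^{(i)}, \qquad \sigma^2 = \frac{1}{m}\sum_{i}\left(z^{(i)} - \mu\right)^2$$

$$z^{(i)}_{\text{norm}} = \frac{z^{(i)} - \mu}{\sqrt{\sigma^2 + \epsilon}}, \qquad \tilde{z}^{(i)} = \gamma\, z^{(i)}_{\text{norm}} + \beta$$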
BN: Implementation
• Import the Batch Normalization module from keras
49
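A minimal sketch of the import and usage (tensorflow.keras is assumed; placing BN before the activation follows the z-vs-a debate mentioned earlier):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(64, input_shape=(100,)),
    layers.BatchNormalization(),   # normalize z before the non-linearity
    layers.Activation("relu"),
    layers.Dense(10, activation="softmax"),
])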
Batch Normalization
• Evolution of loss with and without BN
50
Batch Normalization at Test Time
• Training: data is provided one mini-batch at a time
• Test: we may have only one example at a time
• During training – For mini-batches (of size m):
• Mean and variance required for scaling computed from the whole
mini-batch
51
Batch Normalization at Test Time
• There is NO mini-batch at test time
• Need a different way of coming up with mean and variance – Mean
and variance of one example does not make sense
• Estimated using exponentially weighted average (across mini-
batches in the training set)
• During training, consider layer l and a set of mini-batches:
– X{1}, X{2}, X{3}, … For each mini-batch we compute the corresponding
mean and variance values
– μ{1}[l], μ{2}[l], μ{3}[l], …
– An exponentially weighted average of these becomes the estimate of the mean
(and likewise the variance) for layer l
52
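A small sketch of the exponentially weighted (running) average of the per-mini-batch statistics; the momentum value and function name are illustrative assumptions (Keras' BatchNormalization layer keeps such moving averages with momentum 0.99 by default):

import numpy as np

def update_running_stats(running_mean, running_var, batch, momentum=0.9):
    """Update running estimates of mean/variance from one mini-batch."""
    batch_mean = batch.mean(axis=0)
    batch_var = batch.var(axis=0)
    running_mean = momentum * running_mean + (1 - momentum) * batch_mean
    running_var = momentum * running_var + (1 - momentum) * batch_var
    return running_mean, running_var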
Batch Normalization at Test Time
• At test time, instead of the per-mini-batch normalization used during training,
z_norm(i) = (z(i) − μ) / √(σ² + ε),
we normalize a single example using the estimated statistics:
z_norm = (z − μ) / √(σ² + ε)
• And using the β and γ learnt during training, compute:
z̃ = γ · z_norm + β
53
Case Study: AlexNet
54
Case Study: AlexNet
• Input: 227x227x3 images
• First layer (CONV1): 96 11x11 filters applied at
stride 4
• The output volume size?
– (227-11)/4+1 = 55
55
Output Size
Output size = (N − F) / stride + 1, e.g. (227 − 11) / 4 + 1 = 55
56
Case Study: AlexNet
• Input: 227x227x3 images
• First layer (CONV1): 96 11x11 filters applied at
stride 4
• Output Volume: 55 x 55 x 96
• Total Number of Parameters?
– Parameters: (11*11*3)*96 = 35K
57
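A small sketch that reproduces the arithmetic above (generic conv-layer output size and parameter count; bias terms are ignored, as in the 35K figure, and the function names are assumptions for illustration):

def conv_output_size(n, f, stride, pad=0):
    """Spatial output size of a conv/pool layer: (N - F + 2P) / stride + 1."""
    return (n - f + 2 * pad) // stride + 1

def conv_params(f, in_channels, num_filters):
    """Number of weights in a conv layer (biases ignored)."""
    return f * f * in_channels * num_filters

print(conv_output_size(227, 11, stride=4))   # 55  (CONV1)
print(conv_params(11, 3, 96))                # 34848, roughly 35K
print(conv_output_size(55, 3, stride=2))     # 27  (POOL1)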
Case Study: AlexNet
• Input: 227x227x3 images
• After CONV1: 55x55x96
• Second layer (POOL1): 3x3 filters applied at
stride 2
• Output volume?
– 27x27x96
58
Case Study: AlexNet
Full (simplified) AlexNet architecture:
[227x227x3] INPUT
[55x55x96] CONV1: 96 11x11 filters at stride 4, pad 0
[27x27x96] MAX POOL1: 3x3 filters at stride 2
[27x27x96] NORM1: Normalization layer
[27x27x256] CONV2: 256 5x5 filters at stride 1, pad 2
[13x13x256] MAX POOL2: 3x3 filters at stride 2
[13x13x256] NORM2: Normalization layer
[13x13x384] CONV3: 384 3x3 filters at stride 1, pad 1
[13x13x384] CONV4: 384 3x3 filters at stride 1, pad 1
[13x13x256] CONV5: 256 3x3 filters at stride 1, pad 1
[6x6x256] MAX POOL3: 3x3 filters at stride 2
[4096] FC6: 4096 neurons
[4096] FC7: 4096 neurons
[1000] FC8: 1000 neurons (class scores)
59
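A hedged Keras sketch of the simplified architecture above (the NORM/local response normalization layers, the original grouped convolutions, and dropout in FC6/FC7 are omitted; shapes follow the table):

from tensorflow import keras
from tensorflow.keras import layers

alexnet = keras.Sequential([
    layers.Conv2D(96, 11, strides=4, activation="relu",
                  input_shape=(227, 227, 3)),              # -> 55x55x96  (CONV1)
    layers.MaxPooling2D(3, strides=2),                      # -> 27x27x96  (POOL1)
    layers.Conv2D(256, 5, padding="same", activation="relu"),   # -> 27x27x256 (CONV2)
    layers.MaxPooling2D(3, strides=2),                      # -> 13x13x256 (POOL2)
    layers.Conv2D(384, 3, padding="same", activation="relu"),   # -> 13x13x384 (CONV3)
    layers.Conv2D(384, 3, padding="same", activation="relu"),   # -> 13x13x384 (CONV4)
    layers.Conv2D(256, 3, padding="same", activation="relu"),   # -> 13x13x256 (CONV5)
    layers.MaxPooling2D(3, strides=2),                      # -> 6x6x256   (POOL3)
    layers.Flatten(),
    layers.Dense(4096, activation="relu"),                  # FC6
    layers.Dense(4096, activation="relu"),                  # FC7
    layers.Dense(1000, activation="softmax"),                # FC8 (class scores)
])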
Using Pre-Trained ConvNets
(Transfer Learning)
60
Using Pre-Trained Nets
• A typical CNN has two parts:
– Convolutional base: composed of a stack of
convolutional and pooling layers. The main goal of the
convolutional base is to generate features from the
image.
– Classifier: usually composed of fully connected
layers. The main goal of the classifier is to classify the
image based on the detected features. A fully connected
layer is a layer whose neurons have full connections to
all activations in the previous layer.
61
Using Pre-Trained Nets
• Deep learning models can automatically learn hierarchical feature
representations.
• Features computed by the first layer are general and can be reused
in different problem domains, while features computed by the last
layer are specific and depend on the chosen dataset and task
• A common misconception in the DL community is that
without a Google-esque amount of data, you can’t possibly
hope to create effective deep learning models.
• While data is a critical part of creating the network, the idea
of transfer learning has helped to lessen the data demands.
• Transfer Learning: Taking a pre-trained model and adapting
it to a given problem.
62
Using Pre-Trained Nets
• Strategy I: Train the entire model
– In this case, you use the architecture of the pre-trained
model and train it according to your dataset. You’re
learning the model from scratch, so you’ll need a large
dataset (and a lot of computational power).
63
Using Pre-Trained Nets
• Strategy II: Fine-Tune a pre-trained model
– Change the fully connected layer of the network to match
the data under study
– Continue back propagation and update parameters of all
or a subset of layers (Initial layers can be frozen)
64
Using Pre-Trained Nets
• It is possible to fine-tune all the layers of the ConvNet, or to keep
some of the earlier layers fixed (due to overfitting concerns) and
only fine-tune some higher-level portion of the network.
• Earlier layers of a ConvNet contain more generic features (e.g.
edge detectors or color blob detectors) that should be useful for
many tasks, but later layers become progressively more specific
to the details of the classes contained in the original dataset.
65
Using Pre-Trained Nets
• Strategy III: Use CNN as Feature Extractor
– Freeze the convolutional base
– Pass the data through the network and use the output of
the convolutional base as features
– Feed features to another classifier
– Example:
• For AlexNet: 4096-D vector for every image that contains
the activations of the hidden layer immediately before the
classifier.
• Train Classifier on these features (e.g. SVM)
66
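A sketch of the feature-extractor strategy using a pre-trained VGG16 convolutional base and a linear SVM (VGG16 and scikit-learn are assumptions chosen for illustration; the slide's AlexNet/4096-D example is analogous, and x_train, y_train, x_test, y_test are assumed to exist as preprocessed arrays):

from tensorflow.keras.applications import VGG16
from sklearn.svm import LinearSVC

# Frozen convolutional base: include_top=False drops the original classifier head.
conv_base = VGG16(weights="imagenet", include_top=False,
                  pooling="avg", input_shape=(224, 224, 3))

features = conv_base.predict(x_train)         # one feature vector per image
clf = LinearSVC().fit(features, y_train)      # train a separate classifier on the features
print(clf.score(conv_base.predict(x_test), y_test))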
Summary: Using Pre-Trained ConvNets
70
Implementation: Transfer Learning
• Convert the labels to one-hot encoding (sketched below)
71
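A minimal sketch of the one-hot conversion (integer labels y_train, y_test and the class count are assumptions):

from tensorflow.keras.utils import to_categorical

# e.g. label 2 out of 5 classes -> [0, 0, 1, 0, 0]
y_train_onehot = to_categorical(y_train, num_classes=5)
y_test_onehot = to_categorical(y_test, num_classes=5)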
Implementation: Transfer Learning
• Pass the images through the network using the predict
function to get features (sketched below)
72
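A sketch of passing the images through the frozen convolutional base with predict to obtain features (conv_base and the image arrays are assumptions carried over from the sketch after slide 66):

# Each image becomes a fixed-length feature vector produced by the conv base.
train_features = conv_base.predict(x_train, batch_size=32)
test_features = conv_base.predict(x_test, batch_size=32)
print(train_features.shape)   # (num_images, feature_dim)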
Implementation: Fine Tuning
• Load the model
73
Implementation: Fine Tuning
• Create new model: Add classification layers on
top of convolutional base
74
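A hedged sketch covering both fine-tuning steps on slides 73 and 74: load a pre-trained convolutional base, freeze its earlier layers, and add new classification layers on top (VGG16, the number of frozen layers, and the class count are illustrative assumptions):

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import VGG16

# Step 1: load the pre-trained model without its original classifier head.
conv_base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze the earlier, more generic layers; leave the last block trainable.
for layer in conv_base.layers[:-4]:
    layer.trainable = False

# Step 2: new model = conv base + fresh classification layers for our classes.
model = keras.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(5, activation="softmax"),   # 5 classes assumed
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])

A small learning rate is used so that back propagation only gently updates the pre-trained weights while the new classification layers are learned.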
References
• The material in these slides has been taken from the following
sources.
– Slides by CS231n Winter 2016 – Andrej Karpathy
– Convolutional Neural Networks (CNNs): An Illustrated
Explanation, Abhineet Saxena
– An Intuitive Explanation of Convolutional Neural
Networks
– A Beginner's Guide To Understanding Convolutional Neural
Networks, Adit Deshpande
75