Lecture 6 - Convolution Neural Network (CNN)

The document discusses convolutional neural networks (CNNs) and their use in image classification. It explains that CNNs are more effective than fully connected neural networks for image classification tasks due to their use of local connectivity and weight sharing. The key layers used to build CNNs are convolutional layers, pooling layers, and fully connected layers. Convolutional layers apply filters to input images to extract features, pooling layers downsample the output to reduce dimensionality, and fully connected layers output class predictions.

Uploaded by

Đặng Anh Khoa

Lecture 6

Convolutional Neural Network (CNN)

Convolutional Neural Network Lecture 6
Review of Fully-Connected Neural Network

HCM City Univ. of Technology, Faculty of Mechanical Engineering 2 Duong Van Tu



• The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class.
• Each image has size 32x32x3 (32 wide, 32 high, 3 color channels).
• A single fully-connected neuron in the first hidden layer of a NN would have 32*32*3 = 3072 weights.


• A larger image, e.g. 200x200x3, would give each such neuron 200*200*3 = 120,000 weights.
• This full connectivity is wasteful, and the huge number of parameters would quickly lead to overfitting.
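
The weight counts above can be checked with a few lines of Python. This is only a sketch: the function name `fc_neuron_weights` and the hidden layer width of 100 are assumptions made for illustration, not values from the lecture.

```python
def fc_neuron_weights(width, height, channels):
    """Weights needed by ONE neuron connected to every input value."""
    return width * height * channels


cifar_weights = fc_neuron_weights(32, 32, 3)     # CIFAR-10 sized image
large_weights = fc_neuron_weights(200, 200, 3)   # larger image

# Hypothetical hidden layer width, chosen only to show how quickly
# the total number of weights grows with full connectivity.
hidden_units = 100
layer_weights = large_weights * hidden_units

print(cifar_weights)   # 3072
print(large_weights)   # 120000
print(layer_weights)   # 12000000
```

Even a modest fully-connected layer on a 200x200x3 image would need millions of weights under this assumption, which is the waste the slide refers to.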

Convolutional Neural Network (ConvNet)

• A convolutional neural network, also known as a ConvNet, is a feed-forward neural network that is generally used to analyze visual images by processing data with a grid-like topology.
• A convolutional neural network is used to detect and classify objects in an image.


How Does a CNN Recognize Images?


• The layers of a ConvNet have neurons arranged in 3 dimensions: width, height, depth.
• The neurons in a layer are connected only to a small region of the layer before it, instead of to all of its neurons in a fully-connected manner.
• For CIFAR-10, the final output layer would have dimensions 1x1x10, which is a single vector of class scores.

Layers used to build ConvNets

Three main types of layers are used to build ConvNet architectures:

• Convolutional Layer
• Pooling Layer
• Fully-Connected Layer

A simple ConvNet for CIFAR-10 classification could have the architecture:

INPUT – CONV – RELU – POOL – FC


• INPUT [32x32x3] will hold the raw pixel values of the image: width 32, height 32, and 3 channels R, G, B.
• CONV layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between its weights and the small region it is connected to in the input volume. This may result in a volume such as [32x32x12] if we decide to use 12 filters.


• RELU layer will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([32x32x12]).
• POOL layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [16x16x12].
• FC (i.e. fully-connected) layer will compute the class scores, resulting in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score, such as among the 10 categories of CIFAR-10.
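
The volume sizes quoted for INPUT – CONV – RELU – POOL – FC can be traced with a small shape-only sketch. It computes no pixel values. The helper names are hypothetical, and the CONV step assumes stride 1 with enough zero-padding to preserve width and height, which is what the [32x32x12] figure implies.

```python
def conv_shape(shape, num_filters):
    # Assumption: stride 1 and "same" zero-padding, so width/height are kept.
    width, height, _ = shape
    return (width, height, num_filters)


def relu_shape(shape):
    # Elementwise activation: the volume size is unchanged.
    return shape


def pool_shape(shape, size=2):
    # Downsample width and height by `size`; depth is untouched.
    width, height, depth = shape
    return (width // size, height // size, depth)


def fc_shape(shape, num_classes):
    # One score per class, regardless of the incoming volume size.
    return (1, 1, num_classes)


input_shape = (32, 32, 3)
conv_out = conv_shape(input_shape, num_filters=12)
relu_out = relu_shape(conv_out)
pool_out = pool_shape(relu_out, size=2)
fc_out = fc_shape(pool_out, num_classes=10)

for name, shape in [("INPUT", input_shape), ("CONV", conv_out),
                    ("RELU", relu_out), ("POOL", pool_out), ("FC", fc_out)]:
    print(name, shape)
```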


Convolution
Stride
A convolution operation can be described as “sliding” a small matrix (the filter) across a large matrix (the input), stopping at each coordinate, computing an element-wise multiplication and sum, then storing the output. The stride S is the number of positions the filter moves between consecutive stops.
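
The sliding operation can be sketched in plain Python as follows. This is a minimal illustration, not an efficient implementation: it assumes a square filter, a single channel, and no padding, and it does not flip the filter (the usual convention in ConvNet libraries). The example image and filter values are made up.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (both lists of lists), no padding."""
    k = len(kernel)                               # square kernel assumed
    out_rows = (len(image) - k) // stride + 1
    out_cols = (len(image[0]) - k) // stride + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Element-wise multiplication and sum over the current window.
            total = 0
            for a in range(k):
                for b in range(k):
                    total += image[i * stride + a][j * stride + b] * kernel[a][b]
            row.append(total)
        output.append(row)
    return output


image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 3],
    [1, 2, 1, 0, 1],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]

out_s1 = conv2d(image, kernel, stride=1)   # 3x3 output
out_s2 = conv2d(image, kernel, stride=2)   # 2x2 output
print(out_s1)
print(out_s2)
```

With stride 2 the filter skips every other position, so the output contains only the corner values of the stride-1 result.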


Convolution
Zero-padding
• We need to “pad” the borders of an image to retain the original image size when
applying a convolution.
• Using zero-padding, we can “pad” our input along the borders such that our
output volume size matches our input volume size.
• The amount of padding we apply is controlled by the parameter P.

• Without padding (P = 0), a 3×3 filter with stride 1 produces an output volume (3×3) that is smaller than the input volume (5×5).
• If we instead set P = 1, we can pad our input volume with zeros to create a 7×7 volume.
• Applying the convolution to the padded volume gives an output volume size that matches the original input volume size of 5×5.
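
Zero-padding itself is a simple operation. The sketch below pads a single-channel image stored as a list of lists. The function name `zero_pad` and the all-ones example image are assumptions made for illustration.

```python
def zero_pad(image, p):
    """Return a copy of `image` with `p` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]             # top border
    for row in image:
        padded.append([0] * p + list(row) + [0] * p)     # left/right border
    padded.extend([0] * width for _ in range(p))         # bottom border
    return padded


image = [[1] * 5 for _ in range(5)]   # 5x5 input
padded = zero_pad(image, 1)           # P = 1 gives a 7x7 volume

for row in padded:
    print(row)
```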

The spatial size of the output volume depends on:
• The input volume size (W).
• The receptive field size of the Conv Layer neurons (F).
• The stride with which they are applied (S).
• The amount of zero padding used (P) on the border.

The output size is calculated by the formula:

(W−F+2P)/S+1

For example, for a 7x7 input and a 3x3 filter with stride 1 and pad 0, we would get a 5x5 output.
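
The formula is easy to check numerically. In the sketch below the function name is hypothetical, and raising an error when the division is not exact is a design choice of this example, not something stated in the lecture.

```python
def conv_output_size(w, f, s=1, p=0):
    """Output size along one spatial dimension: (W - F + 2P) / S + 1."""
    numerator = w - f + 2 * p
    if numerator % s != 0:
        # Assumption of this sketch: treat settings where the filter
        # does not fit the input evenly as invalid.
        raise ValueError("filter does not tile the input evenly")
    return numerator // s + 1


print(conv_output_size(7, 3, s=1, p=0))   # 5, the example above
print(conv_output_size(5, 3, s=1, p=0))   # 3, no padding shrinks the output
print(conv_output_size(5, 3, s=1, p=1))   # 5, P = 1 preserves the size
print(conv_output_size(7, 3, s=2, p=0))   # 3, a larger stride shrinks it further
```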


[Figure: convolution as element-wise multiplication of a 3x3 filter with a 5x5 input]


Pooling Layer
Pooling is a down-sampling operation that reduces the dimensionality of the
feature map. The rectified feature map now goes through a pooling layer to
generate a pooled feature map.
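
A pooling layer can be sketched as a window that slides over the feature map and keeps one summary value per window. The example below is an illustration under stated assumptions: a 2x2 window with stride 2 (a common choice, not specified in the lecture), made-up feature map values, and hypothetical function names. It shows both max pooling and average pooling.

```python
def pool2d(feature_map, size=2, stride=2, reducer=max):
    """Down-sample `feature_map` by applying `reducer` to each window."""
    out_rows = (len(feature_map) - size) // stride + 1
    out_cols = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            window = [feature_map[i * stride + a][j * stride + b]
                      for a in range(size) for b in range(size)]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


def average(values):
    return sum(values) / len(values)


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 1],
    [0, 4, 3, 8],
]

max_pooled = pool2d(feature_map)                    # keeps the largest value
avg_pooled = pool2d(feature_map, reducer=average)   # keeps the mean value
print(max_pooled)   # [[6, 4], [7, 9]]
print(avg_pooled)   # [[3.75, 2.25], [3.25, 5.25]]
```

Either way, the 4x4 feature map becomes a 2x2 pooled feature map.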


Average pooling


Activation Layer


Flattening
Flattening is used to convert all the resultant 2-Dimensional arrays from pooled
feature maps into a single long continuous linear vector.
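
As a small sketch, flattening a stack of pooled feature maps amounts to reading every value out in order. The function name and the two 2x2 example maps are assumptions made for illustration.

```python
def flatten(feature_maps):
    """Concatenate all values of all 2-D feature maps into one vector."""
    return [value
            for fmap in feature_maps   # one 2-D map per filter
            for row in fmap
            for value in row]


pooled_maps = [
    [[6, 4], [7, 9]],   # pooled feature map from a first filter
    [[1, 0], [2, 3]],   # pooled feature map from a second filter
]

vector = flatten(pooled_maps)
print(vector)   # [6, 4, 7, 9, 1, 0, 2, 3]
```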

• The pixels from the image are fed to the convolutional layer, which performs the convolution operation.
• This results in a convolved map.
• A ReLU function is applied to the convolved map to generate a rectified feature map.
• The image is processed with multiple convolution and ReLU layers to locate the features.
• Different pooling layers with various filters are used to identify specific parts of the image.
• The pooled feature map is flattened and fed to a fully connected layer to get the final output.
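
The steps above can be sketched end to end on a tiny example. Everything concrete here is an assumption chosen for illustration: the 6x6 single-channel image, the vertical-edge filter, the two output classes, and the hand-picked fully connected weights. A real network would learn its filters and weights from data.

```python
def conv2d(image, kernel):
    # Stride 1, no padding, square kernel; no kernel flip.
    k = len(kernel)
    size = len(image) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(size)] for i in range(size)]


def relu(fmap):
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    n = len(fmap) // size
    return [[max(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(n)] for i in range(n)]


def flatten(fmap):
    return [v for row in fmap for v in row]


def fully_connected(vector, weights, biases):
    # One dot product plus bias per output class.
    return [sum(w * x for w, x in zip(ws, vector)) + b
            for ws, b in zip(weights, biases)]


image = [[0, 0, 1, 1, 0, 0] for _ in range(6)]     # bright vertical stripe
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]      # vertical-edge filter

convolved = conv2d(image, kernel)    # 4x4 convolved map
rectified = relu(convolved)          # negative responses set to zero
pooled = max_pool(rectified)         # 2x2 pooled feature map
vector = flatten(pooled)             # length-4 vector

weights = [[0.5, 0.0, 0.5, 0.0],     # hypothetical class 0
           [0.0, 0.5, 0.0, 0.5]]     # hypothetical class 1
biases = [0.0, 0.0]
scores = fully_connected(vector, weights, biases)
print(scores)   # [0.0, 3.0]
```

The class with the highest score would be taken as the prediction.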
