
Class Notes Unit 5

The document provides an overview of neural networks, focusing on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), including their architectures, functionalities, and applications. It explains the structure of CNNs, detailing layers such as input, convolutional, pooling, and fully connected layers, as well as the significance of RNNs in handling sequential data. Additionally, it introduces PyTorch Tensors, highlighting their role in deep learning and how to create them using various methods.


Neural Networks and Representation Learning, Convolutional Layers, Multichannel Convolution Operation, Networks

Introduction to RNN, RNN Code, PyTorch Tensors: Deep Learning with PyTorch, CNN in PyTorch

A Convolutional Neural Network (CNN) is a type of Deep Learning neural network architecture commonly used in Computer Vision. Computer Vision is a field of Artificial Intelligence that enables a computer to understand and interpret images and other visual data.

When it comes to Machine Learning, Artificial Neural Networks perform really well. Neural Networks are used on various kinds of data, like images, audio, and text. Different types of Neural Networks are used for different purposes: for example, for predicting a sequence of words we use a Recurrent Neural Network (more precisely, an LSTM); similarly, for image classification we use Convolutional Neural Networks. In these notes, we build the basic building blocks of a CNN.

In a regular Neural Network there are three types of layers:

1. Input Layer: It’s the layer in which we give input to our model. The number of neurons in this layer is equal to the total number of features in our data (the number of pixels in the case of an image).
2. Hidden Layer: The input from the Input layer is then fed into the hidden layer. There can be many hidden layers depending upon our model and data size. Each hidden layer can have a different number of neurons, generally greater than the number of features. The output of each layer is computed by matrix multiplication of the output of the previous layer with the learnable weights of that layer, then addition of learnable biases, followed by an activation function, which makes the network nonlinear.
3. Output Layer: The output from the hidden layer is then fed into a logistic function like sigmoid or softmax, which converts the output for each class into its probability score.
The process of feeding data into the model and obtaining the output from each layer as in the above steps is called feedforward. We then calculate the error using an error function; some common error functions are cross-entropy, squared loss error, etc. The error function measures how well the network is performing.
After that, we backpropagate through the model by calculating the derivatives. This step is called backpropagation, and it is used to minimize the loss.
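As a minimal illustration of feedforward and backpropagation (a hedged sketch; the layer sizes and data below are illustrative, not taken from these notes):

import torch

# Input layer (4 features) -> hidden layer (8 neurons) -> output layer (2 classes)
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8),
    torch.nn.ReLU(),          # activation function makes the network nonlinear
    torch.nn.Linear(8, 2),
)
x = torch.randn(1, 4)         # one sample with 4 features
target = torch.tensor([1])    # its class label

logits = model(x)                                          # feedforward pass
loss = torch.nn.functional.cross_entropy(logits, target)  # error function (cross-entropy)
loss.backward()                                            # backpropagation: computes the derivatives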

A Convolutional Neural Network consists of multiple layers: the input layer, convolutional layers, pooling layers, and fully connected layers.
The convolutional layer applies filters to the input image to extract features, the pooling layer downsamples the image to reduce computation, and the fully connected layer makes the final prediction. The network learns the optimal filters through backpropagation and gradient descent.

Convolutional Neural Networks, or covnets, are neural networks that share their parameters.
Imagine you have an image. It can be represented as a cuboid having a length and width (the dimensions of the image) and a height (i.e., the channels, as images generally have red, green, and blue channels).

Now imagine taking a small patch of this image and running a small neural network, called a filter or kernel, on it, with say K outputs, and representing them vertically. Now slide that neural network across the whole image; as a result, we will get another image with a different width, height, and depth. Instead of just the R, G, and B channels, we now have more channels, but less width and height. This operation is called convolution. If the patch size were the same as that of the image, it would be a regular neural network. Because of this small patch, we have fewer weights.
 Convolution layers consist of a set of learnable filters (or kernels) having small widths and heights and the same depth as that of the input volume (3 if the input layer is an image input).
 For example, if we have to run a convolution on an image with dimensions 34x34x3, the possible size of the filters can be a x a x 3, where ‘a’ can be anything like 3, 5, or 7, but smaller than the image dimensions.
 During the forward pass, we slide each filter across the whole input volume step by step, where each step is called a stride (which can have a value of 2, 3, or even 4 for high-dimensional images), and compute the dot product between the kernel weights and the patch from the input volume.
 As we slide our filters, we get a 2-D output for each filter, and we stack them together; as a result, we get an output volume having a depth equal to the number of filters. The network will learn all the filters.

A complete Convolutional Neural Network architecture is also known as a covnet. A covnet is a sequence of layers, and every layer transforms one volume to another through a differentiable function.

Let’s take an example by running a covnet on an image of dimension 32 x 32 x 3.
 Input Layer: It’s the layer in which we give input to our model. In a CNN, generally, the input will be an image or a sequence of images. This layer holds the raw input of the image with width 32, height 32, and depth 3.

 Convolutional Layer: This is the layer which is used to extract features from the input dataset. It applies a set of learnable filters, known as kernels, to the input images. The filters/kernels are small matrices, usually of 2×2, 3×3, or 5×5 shape. Each one slides over the input image data and computes the dot product between the kernel weights and the corresponding input image patch. The output of this layer is referred to as feature maps. Suppose we use a total of 12 filters for this layer; we’ll get an output volume of dimension 32 x 32 x 12.

 Activation Layer: By adding an activation function to the output of the preceding layer, activation layers add nonlinearity to the network. It applies an element-wise activation function to the output of the convolution layer. Some common activation functions are ReLU: max(0, x), Tanh, Leaky ReLU, etc. The volume remains unchanged, hence the output volume will have dimensions 32 x 32 x 12.

 Pooling Layer: This layer is periodically inserted in the covnet, and its main function is to reduce the size of the volume, which makes the computation fast, reduces memory, and also prevents overfitting. Two common types of pooling layers are max pooling and average pooling. If we use a max pool with 2 x 2 filters and stride 2, the resultant volume will be of dimension 16x16x12.

 Flattening: The resulting feature maps are flattened into a one-dimensional vector after the convolution and pooling layers so they can be passed into a fully connected layer for classification or regression.

 Fully Connected Layer: It takes the input from the previous layer and computes the final classification or regression task.

 Output Layer: The output from the fully connected layers is then fed into a logistic function for classification tasks, like sigmoid or softmax, which converts the output for each class into its probability score.
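A minimal PyTorch sketch of the shape changes in the running example above (the layer sizes are illustrative, not part of the original notes):

import torch

x = torch.randn(1, 3, 32, 32)  # one 32 x 32 RGB image (depth 3)
conv = torch.nn.Conv2d(3, 12, kernel_size=3, padding=1)  # 12 learnable filters
pool = torch.nn.MaxPool2d(kernel_size=2, stride=2)

h = torch.relu(conv(x))    # convolution + activation: (1, 3, 32, 32) -> (1, 12, 32, 32)
h = pool(h)                # max pooling: (1, 12, 32, 32) -> (1, 12, 16, 16)
print(h.flatten(1).shape)  # flattened to one vector per sample for the fully connected layer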

Advantages of Convolutional Neural Networks (CNNs):

1. Good at detecting patterns and features in images, videos, and audio signals.
2. Robust to translation, rotation, and scaling of the input.
3. End-to-end training; no need for manual feature extraction.
4. Can handle large amounts of data and achieve high accuracy.

Disadvantages of Convolutional Neural Networks (CNNs):

1. Computationally expensive to train and require a lot of memory.
2. Can be prone to overfitting if not enough data or proper regularization is used.
3. Require large amounts of labeled data.
4. Interpretability is limited; it’s hard to understand what the network has learned.

A Recurrent Neural Network (RNN) is a type of Neural Network where the output from the previous step is fed as input to the current step. In traditional neural networks, all the inputs and outputs are independent of each other, but in cases like when it is required to predict the next word of a sentence, the previous words are required, and hence there is a need to remember the previous words. Thus RNNs came into existence, which solved this issue with the help of a Hidden Layer. The main and most important feature of an RNN is its Hidden state, which remembers some information about a sequence.
An RNN has a memory which remembers all information about what has been calculated. It uses the same parameters for each input, as it performs the same task on all the inputs or hidden layers to produce the output. This reduces the complexity of parameters, unlike other neural networks.

The working of an RNN can be understood with the help of the below example:
Suppose there is a deeper network with one input layer, three hidden layers, and one
output layer. Then like other neural networks, each hidden layer will have its own set of weights
and biases, let’s say, for hidden layer 1 the weights and biases are (w1, b1), (w2, b2) for the
second hidden layer, and (w3, b3) for the third hidden layer. This means that each of these layers
is independent of the other, i.e. they do not memorize the previous outputs.

Now the RNN will do the following:


 RNN converts the independent activations into dependent activations by providing the same weights and biases to all the layers, thus reducing the number of parameters, and memorizes each previous output by giving each output as input to the next hidden layer.
 Hence these three layers can be joined together, such that the weights and biases of all the hidden layers are the same, into a single recurrent layer.

The current state is computed from the previous state and the current input, and the output is read off the current state:

ht = tanh(whh * ht-1 + wxh * xt)
yt = why * ht

where:

ht -> current state
ht-1 -> previous state
xt -> input state
whh -> weight at recurrent neuron
wxh -> weight at input neuron
yt -> output
why -> weight at output layer

An RNN is trained as follows:

1. A single time step of the input is provided to the network.
2. Then calculate its current state using the current input and the previous state.
3. The current ht becomes ht-1 for the next time step.
4. One can go as many time steps as the problem requires and join the information from all the previous states.
5. Once all the time steps are completed, the final current state is used to calculate the output.
6. The output is then compared to the actual output, i.e., the target output, and the error is generated.
7. The error is then back-propagated through the network to update the weights, and hence the network (RNN) is trained.
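The recurrence in steps 1 to 5 can be sketched in plain Python/NumPy (a hedged illustration; all sizes and data are made up):

import numpy as np

hidden_size, input_size = 4, 3
whh = np.random.randn(hidden_size, hidden_size) * 0.1  # weight at recurrent neuron
wxh = np.random.randn(hidden_size, input_size) * 0.1   # weight at input neuron
why = np.random.randn(1, hidden_size) * 0.1            # weight at output layer

ht = np.zeros(hidden_size)                   # initial state
for xt in np.random.randn(10, input_size):   # one time step of input at a time
    ht = np.tanh(whh @ ht + wxh @ xt)        # current state from previous state and input
yt = why @ ht                                # output computed from the final current state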
Advantages of RNNs:

1. An RNN remembers each and every piece of information through time. This is useful in time series prediction because of its feature of remembering previous inputs; a variant designed for this is called Long Short Term Memory (LSTM).
2. Recurrent neural networks can even be combined with convolutional layers to extend the effective pixel neighborhood.

Disadvantages of RNNs:

1. Gradient vanishing and exploding problems.
2. Training an RNN is a very difficult task.
3. It cannot process very long sequences when using tanh or ReLU as an activation function.

Applications of RNNs:

1. Language Modelling and Generating Text
2. Speech Recognition
3. Machine Translation
4. Image Recognition, Face Detection
5. Time Series Forecasting

What is meant by RNN code?

RNN code refers to the code implementation of a Recurrent Neural


Network (RNN) algorithm. RNNs are a type of artificial neural network that can handle sequential
data, such as time series data or natural language text. The code for an RNN will typically involve
defining the neural network architecture, specifying the input data and target data, compiling the
model with an appropriate optimizer and loss function, and training the model on the input and target
data.

The specific code for an RNN can vary depending on the programming language and framework
being used. For example, in Python with TensorFlow or Keras, the code may involve defining a
SimpleRNN layer, specifying the number of hidden units, sequence length, and input dimension.
Other types of RNN layers, such as LSTM or GRU, may also be used depending on the requirements
of the problem being solved.
Ultimately, the RNN code should be able to take in input data, process it through the RNN layers, and
output a prediction or classification result.
RNN Code

from keras.models import Sequential
from keras.layers import Dense, SimpleRNN

# Define the model: one SimpleRNN layer followed by a Dense output layer
model = Sequential()
model.add(SimpleRNN(64, input_shape=(10, 1)))
model.add(Dense(1, activation='sigmoid'))

# Compile with a loss function, an optimizer, and metrics
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

X_train = ...  # Input data, shape (num_samples, sequence_length, input_dim)
y_train = ...  # Target data, shape (num_samples,)

model.fit(X_train, y_train, epochs=10, batch_size=32)

X_test = ...  # New input data, shape (num_samples, sequence_length, input_dim)

y_pred = model.predict(X_test)

In this example, we're using Keras, a high-level neural networks API, to define, compile, and train
our RNN model. We're also using a SimpleRNN layer with 64 hidden units, which takes input
sequences of length 10 and one-dimensional input data. We're then adding a Dense output layer
with one neuron and a sigmoid activation function, since we're dealing with a binary classification
problem (i.e., the target data is either 0 or 1). Finally, we're compiling the model with binary cross-
entropy as the loss function and the Adam optimizer, and training it on our input and target data
using a batch size of 32 and for 10 epochs. Once the model is trained, we can use it to make
predictions on new input data using the predict method.

Introduction to PyTorch Tensors

The following article provides an outline of PyTorch Tensors. PyTorch was released as an open-source framework in 2017 by Facebook, and it has been very popular among developers and the research community. PyTorch has made building deep neural network models easier by providing easy programming and faster computation. One of PyTorch's strongest features is its Tensors. A tensor is a single- or multi-dimensional array whose elements all share a single data type.

Tensors are also used in the TensorFlow framework, which Google released. Tensors are essentially like NumPy arrays in Python, except that they can be processed using GPUs or TPUs for training neural network models. PyTorch includes libraries for computing gradients for feed-forward networks as well as back-propagation. PyTorch has better support for Python libraries like NumPy and SciPy compared to other frameworks like TensorFlow.
PyTorch Tensors Dimensions

In any linear algebraic operation, the user may have data in vector, matrix, or N-dimensional form. A vector is a one-dimensional tensor, a matrix is a two-dimensional tensor, and an image is a three-dimensional tensor with the RGB channels as one dimension. A PyTorch tensor is a multi-dimensional array, the same as a NumPy array, and it acts as a container or storage for numbers. To create any neural network for a deep learning model, all linear algebraic operations are performed on tensors to transform one tensor into new tensors.

PyTorch tensors were developed even though NumPy arrays already offered the multi-dimensional array property, because PyTorch has the advantage of running on top of a GPU, and tensors can also integrate with Python libraries like NumPy, scikit-learn, and pandas. Tensors store data in the form of an array, and records can be accessed by using an index. Tensors can represent many types of data with arbitrary dimensions, like images, audio, or time-series data.

How to Create PyTorch Tensors Using Various Methods

Let’s create different PyTorch tensors. Before creating any tensor, import the torch package using the below command:

Code:

import torch

1. Create a tensor from pre-existing data in list or sequence form using the torch class.
The example below is a 2*3 matrix with values 0 and 1.

Syntax:

torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False)

Code:

tensor_b = torch.Tensor([[0, 0, 0], [1, 1, 1]])
tensor_b

2. Create an n*m tensor using a random function in torch.
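For example (a minimal sketch; torch.rand fills the tensor with uniform random values in [0, 1), and the 4*3 shape and the name tensor_c are illustrative):

Code:

tensor_c = torch.rand(4, 3)  # 4*3 tensor of random values
tensor_c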

3. Creating a tensor of a numerical type using functions such as ones and zeros.
Syntax:

torch.zeros(data_size, dtype=input.dtype, layout=input.layout, device=input.device)

Code:

tensor_d = torch.zeros(3, 3)
tensor_d

In the above, torch.zeros() is used to create a 3*3 matrix with all values as ‘0’ (zero).

4. Creating a PyTorch tensor from a NumPy array.

To create a tensor from NumPy, create an array using NumPy and then convert it to a tensor using the torch.as_tensor function.

Syntax:

torch.as_tensor(data, dtype=None, device=None)
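For example (a minimal sketch; the array values and names are illustrative):

Code:

import numpy as np
array_a = np.array([1, 2, 3])
tensor_np = torch.as_tensor(array_a)  # converts the NumPy array to a tensor
tensor_np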

5. Creating new tensors by applying transformations on existing tensors.

Here is a basic tensor operation to perform a matrix product and get a new tensor.

Code:

tensor_e = torch.Tensor([[1, 2], [7, 8]])
tensor_f = torch.Tensor([[10], [20]])
tensor_mat = tensor_e.mm(tensor_f)
tensor_mat

Parameters:

Here is the list and information on the parameters used in the syntax above:

data: Data for the tensor.

dtype: Data type of the returned tensor.

device: Device (CPU or CUDA) on which the returned tensor is placed.

requires_grad: A boolean (True or False) indicating whether to record automatic gradients on the returned tensor.

data_size: Data shape of the input tensor.

pin_memory: If pin_memory is set to True, the returned tensor will be allocated in pinned memory.

Following are some of the key important points about tensors in PyTorch:

1. Tensors are important in PyTorch as they are its fundamental data structure, and all neural network models are built using tensors, which have the ability to perform linear algebra operations.

2. Tensors are similar to NumPy arrays, but they are more powerful, as they can perform their computations on a GPU as well as a CPU. Hence, they can be much faster than Python's NumPy library.

3. They offer seamless interoperability with Python libraries, so the programmer can easily use scikit-learn and SciPy with tensors. Also, using functions like as_tensor or from_numpy, the programmer can easily convert a NumPy array to a PyTorch tensor.

4. One of the important features offered by tensors is that they can keep track of all the operations performed on them, which helps to compute gradients of the output; this is done using the Autograd functionality of tensors (see the sketch after this list).
5. A tensor is a multi-dimensional array which can hold data for images, which can be converted into a 3-dimensional array based on color channels like RGB (Red, Green and Blue); it can also hold audio data or time-series data; any unstructured data can be addressed using tensors.
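A minimal sketch of the Autograd behaviour described in point 4 (the function y = x^2 + 3x is illustrative):

import torch

x = torch.tensor(2.0, requires_grad=True)  # record operations performed on x
y = x ** 2 + 3 * x
y.backward()                               # Autograd computes dy/dx
print(x.grad)                              # dy/dx = 2*x + 3 = 7.0 at x = 2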

Deep Learning with Pytorch

PyTorch is a popular open-source machine learning framework that is widely used for deep
learning applications. It is developed by Facebook's AI research team and provides a fast and
flexible way to develop deep learning models.

Here are the basic steps to get started with deep learning using PyTorch:

1. Install PyTorch: You can install PyTorch using pip or conda depending on your environment.
Visit the PyTorch website for detailed installation instructions.

2. Load and preprocess data: PyTorch provides several tools to load and preprocess data, including
the DataLoader class, which allows you to batch and shuffle your data.

3. Define your model: In PyTorch, you can define your model using the nn module, which
provides a wide range of pre-built layers and modules. You can also create your custom layers
and modules.

4. Train your model: PyTorch provides several tools for training your model, including loss
functions, optimizers, and schedulers. You can also write custom training loops using PyTorch's
autograd engine.

5. Evaluate your model: Once you have trained your model, you can evaluate its performance on a
validation set using metrics such as accuracy or F1 score.

6. Save and load your model: PyTorch provides tools for saving and loading trained models so that
you can use them for inference or continue training later.

In addition to these basic steps, PyTorch provides several advanced features for deep learning,
including distributed training, GPU acceleration, and automatic differentiation. PyTorch also has
an active community that provides tutorials, examples, and support.
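As a hedged end-to-end sketch of steps 3 to 5 (the model, data, and hyperparameters below are illustrative, not prescribed by the notes):

import torch

model = torch.nn.Sequential(               # step 3: define the model with the nn module
    torch.nn.Linear(10, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2))
loss_fn = torch.nn.CrossEntropyLoss()      # step 4: loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # step 4: optimizer

X = torch.randn(100, 10)                   # 100 samples with 10 features each
y = torch.randint(0, 2, (100,))            # binary class labels

for epoch in range(10):                    # step 4: a simple training loop
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                        # the autograd engine computes gradients
    optimizer.step()

accuracy = (model(X).argmax(dim=1) == y).float().mean()  # step 5: evaluate
print(accuracy)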
Implementation of CNN in PyTorch

The following steps are used to create a Convolutional Neural Network using PyTorch.

Step 1

Import the necessary packages for creating a simple neural network.


import torch
from torch.autograd import Variable
import torch.nn.functional as F

Step 2

Create a class with a batch representation of the convolutional neural network. Our batch shape for input x is of dimension (3, 32, 32).

class SimpleCNN(torch.nn.Module):

    def __init__(self):
        super(SimpleCNN, self).__init__()
        # Input channels = 3 (RGB), output channels = 18
        self.conv1 = torch.nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)
        self.pool = torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        # 18 channels * 16 * 16 spatial positions after pooling
        self.fc1 = torch.nn.Linear(18 * 16 * 16, 64)
        self.fc2 = torch.nn.Linear(64, 10)

Step 3

Compute the activation of the first convolution; the size changes from (3, 32, 32) to (18, 32, 32). Pooling changes the size from (18, 32, 32) to (18, 16, 16). Then reshape the data for the input of the fully connected layer, due to which the size changes from (18, 16, 16) to (1, 4608).

Recall that -1 infers this dimension from the other given dimension.

    def forward(self, x):
        x = F.relu(self.conv1(x))     # (3, 32, 32) -> (18, 32, 32)
        x = self.pool(x)              # (18, 32, 32) -> (18, 16, 16)
        x = x.view(-1, 18 * 16 * 16)  # flatten: (18, 16, 16) -> (1, 4608)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
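A quick usage sketch of the class above (the random input stands in for a real 32 x 32 RGB image):

model = SimpleCNN()
out = model(torch.randn(1, 3, 32, 32))  # forward pass on one image
print(out.shape)                        # torch.Size([1, 10]), one score per class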
