Data is represented using tensors, and as a result, neural network programming utilizes tensors.
Applications
● NLP, Object detection, Image Processing
Tensors
Tensor Properties
Rank: Number of tensor axes. A scalar has rank 0, a vector has rank 1, and a matrix has rank 2.
Axis or Dimension: A particular dimension of a tensor.
Size: The total number of items in the tensor, i.e. the product of the elements of the shape vector.
Immutable: TensorFlow tensors cannot be changed once created; operations always produce new tensors.
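A minimal sketch of these properties using TensorFlow's Python API (the tensor values below are illustrative, not from the notes):

```python
import tensorflow as tf

# A rank-2 tensor (matrix) with shape (2, 3)
t = tf.constant([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

print(t.ndim)      # rank: 2 (number of axes)
print(t.shape)     # shape: (2, 3); each axis is one dimension
print(tf.size(t))  # size: 6, the product of the shape vector (2 * 3)

# Immutability: operations return new tensors rather than modifying t in place.
t2 = t + 1.0
```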
How TensorFlow Works (Training and Development)
How neural networks work in a deep learning model using TensorFlow
● Each neuron has an activation function that defines the output of the neuron. The activation function is used to introduce non-linearity into the modeling capabilities of the network; there are several activation functions to choose from.
● Next, we use a loss function to estimate the loss (or error), i.e. to compare and measure how good or bad our prediction result was in relation to the correct result.
● After this, backpropagation sends the error back through the network: each neuron receives an error signal, but the neurons of the hidden layers only receive a fraction of the total loss signal, based on the relative contribution each neuron made to the original output. This process is repeated, layer by layer, until every neuron in the network has received a loss signal that describes its relative contribution to the total loss.
● Now that we have spread this information back, we can adjust the weights of the connections between neurons, with the goal of making the loss as close as possible to zero the next time the network makes a prediction. For this, we use a technique called gradient descent.
● For model parameterization we use epochs, batch size, and learning rate (see the training sketch below).
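The bullets above map onto a standard training loop. Below is a minimal, hypothetical Keras sketch (the data, layer sizes, and hyperparameter values are placeholders, not taken from the notes): the activation functions add non-linearity, the loss function measures the prediction error, fit() performs backpropagation and gradient descent, and epochs, batch size, and learning rate parameterize the training.

```python
import numpy as np
import tensorflow as tf

# Placeholder data: 200 examples with 10 features and a binary label.
x_train = np.random.rand(200, 10).astype("float32")
y_train = np.random.randint(0, 2, size=(200,))

# Hypothetical model; layer sizes are illustrative only.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    tf.keras.layers.Dense(16, activation="relu"),    # activation adds non-linearity
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output neuron
])

# The loss function measures how good/bad a prediction is vs. the correct label;
# the optimizer applies gradient descent with the chosen learning rate.
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# fit() runs forward passes, backpropagates the loss to every neuron,
# and adjusts the weights; epochs and batch_size parameterize training.
model.fit(x_train, y_train, epochs=10, batch_size=32)
```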
CNN in TensorFlow
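The notes give no worked example for this heading, so the following is only a minimal, hypothetical CNN sketch in Keras (the input shape, filter counts, and 10-class output are assumptions for illustration):

```python
import tensorflow as tf

# Minimal CNN: convolution + pooling layers extract spatial features,
# then dense layers classify. All sizes below are illustrative.
cnn = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                    # e.g. grayscale images
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),      # assumed 10 classes
])

cnn.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
```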
Mod 3
Q1.
Q2.
Q3.
https://towardsdatascience.com/learning-process-of-a-deep-neural-network-5a9768d7a651
Deep learning is a type of machine learning and artificial intelligence (AI) that imitates
the way humans gain certain types of knowledge. Deep learning is an important
element of data science, which includes statistics and predictive modeling. It is
extremely beneficial to data scientists who are tasked with collecting, analyzing and
interpreting large amounts of data; deep learning makes this process faster and easier.
Long short-term memory is a modified RNN architecture that addresses the problem of
training over long sequences and retaining memory.
LSTM is best suited for sequence data. LSTM can predict, classify, and generate
sequence data.
Prediction based on a sequence of data is called sequence prediction. Sequence prediction is said to have four types (a minimal LSTM sketch of sequence classification follows the list):
● Sequence classification
● Sequence generation
● Sequence-to-sequence prediction
● Sequence prediction (predicting the next value in a sequence)
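As an illustration of the first listed type, here is a minimal, hypothetical Keras sketch of an LSTM for sequence classification (the vocabulary size, sequence length, and layer sizes are assumptions, not from the notes):

```python
import tensorflow as tf

# Hypothetical sequence classifier: sequences of 100 integer tokens
# from a 10,000-word vocabulary, mapped to a single binary label.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,), dtype="int32"),
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.LSTM(64),                        # retains memory across the sequence
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sequence-classification output
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```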
Mod 4
Q1.
When there is a small number of training examples, the model sometimes learns from noise or unwanted details in the training examples, to an extent that negatively impacts the performance of the model on new examples. This phenomenon is known as overfitting.
It means that the model will have a difficult time generalizing on a new dataset.
Overfitting refers to a model that models the training data too well.
Overfitting happens when a model learns the detail and noise in the training data
to the extent that it negatively impacts the performance of the model on new
data. This means that the noise or random fluctuations in the training data are picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data and negatively impact the model's ability to generalize.
Overfitting is more likely with nonparametric and nonlinear models that have
more flexibility when learning a target function. As such, many nonparametric
machine learning algorithms also include parameters or techniques to limit and
constrain how much detail the model learns.
For example, the decision tree is a nonparametric machine learning algorithm that is very flexible and is subject to overfitting the training data. This problem can be addressed by pruning the tree after it has learned, in order to remove some of the detail it has picked up.
There are multiple ways to fight overfitting in the training process. In this example, we will use data augmentation and add Dropout to the model, as sketched below.
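A minimal, hypothetical Keras sketch of that idea (the augmentation settings, dropout rate, image size, and class count are illustrative assumptions, not the original example's values):

```python
import tensorflow as tf

# Data augmentation generates modified copies of the training images, and
# Dropout randomly zeroes activations; both help reduce overfitting.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

model = tf.keras.Sequential([
    tf.keras.Input(shape=(180, 180, 3)),             # assumed image size
    data_augmentation,
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Dropout(0.2),                    # drop 20% of activations in training
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(5, activation="softmax"),  # assumed 5 classes
])
```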
Q3.
https://dataaspirant.com/word-embedding-techniques-nlp/#t-1597685144204
The word2vec method learns all those types of relationships between words while building the model. For this purpose, word2vec uses two types of methods:
1. Skip-gram
2. CBOW (Continuous Bag of Words)
1. Skip-gram
In this method, we take the center word of the window as the input and the context words (neighbor words) as the outputs; the word2vec model then predicts the context words of a center word. Skip-gram works well with small datasets and represents rare words really well.
2. CBOW (Continuous Bag of Words)
CBOW is simply the reverse of the skip-gram method: here we take the context words as input and predict the center word within the window. Another difference from the skip-gram method is that CBOW trains faster and gives better representations for more frequent words. A minimal sketch of both methods follows.
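The sketch below assumes the gensim library (not named in the notes) and a tiny placeholder corpus; the sg flag switches between skip-gram and CBOW:

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences (placeholder data).
sentences = [
    ["deep", "learning", "uses", "neural", "networks"],
    ["word2vec", "learns", "word", "embeddings"],
    ["skip", "gram", "predicts", "context", "words"],
]

# sg=1 -> skip-gram: predict the context words from the center word.
skip_gram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# sg=0 -> CBOW: predict the center word from its context words.
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

print(skip_gram.wv["word2vec"])  # the learned 50-dimensional vector for a word
```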
Resources
TF-IDF
TF
IDF
Implementation of TF-IDF by using Sklearn
Word2vec
Skip-Gram
Continuous Bag-of-words
Word2vec implementation
Word embedding model using Pre-trained models
Google word2vec
Stanford Glove Embeddings