
23CS2902 DEEP LEARNING

UNIT - 4: MORE DEEP LEARNING ARCHITECTURES

AUTOENCODER:

• An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner.
• The goal of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal noise.
• Along with the reduction (encoder) side, a reconstructing (decoder) side is also learned, where the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input.
• This helps autoencoders learn the important features present in the data.
• When a representation allows a good reconstruction of its input, it has retained much of the information present in the input (a minimal code sketch is given below).
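As a concrete illustration, here is a minimal, hypothetical autoencoder sketch in PyTorch; the layer sizes (784 inputs, a 32-unit bottleneck) and the batch of random data are assumptions made only for this example:

import torch
import torch.nn as nn

# Encoder compresses the input into a small code; decoder reconstructs it.
# The sizes 784 -> 32 -> 784 are illustrative, not prescribed by the notes.
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())
decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())

x = torch.rand(64, 784)           # a batch of fake flattened inputs
h = encoder(x)                    # reduced encoding (bottleneck)
x_hat = decoder(h)                # reconstruction of the input

loss = nn.MSELoss()(x_hat, x)     # reconstruction error to be minimized
loss.backward()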

TYPES OF AUTOENCODERS:

There are, broadly, seven types of autoencoders:

• Denoising Autoencoder
• Sparse Autoencoder
• Deep Autoencoder
• Contractive Autoencoder
• Undercomplete Autoencoder
• Convolutional Autoencoder
• Variational Autoencoder

DENOISING AUTOENCODER:

• A Denoising Autoencoder is a modification of the autoencoder that prevents the network from learning the identity function.
• Autoencoders are neural networks commonly used for feature selection and extraction. However, when there are more nodes in the hidden layer than there are inputs, the network risks learning the so-called "identity function", also called the "null function", meaning that the output simply equals the input, making the autoencoder useless.
• Denoising autoencoders solve this problem by corrupting the data on purpose, randomly setting some of the input values to zero. In general, the percentage of input nodes set to zero is about 50%; other sources suggest a lower fraction, such as 30%. It depends on the amount of data and the number of input nodes you have.
• Specifically, if the autoencoder has too much capacity, it can simply memorize the data, so that the output equals the input and no useful representation learning or dimensionality reduction is performed.
• When calculating the loss function, it is important to compare the output values with the original input, not with the corrupted input. That way, the risk of learning the identity function instead of extracting features is eliminated (see the sketch below).
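A minimal sketch of this corruption-and-reconstruction step in PyTorch; the 30% masking rate and layer sizes are assumptions for illustration, and the loss is deliberately computed against the clean input:

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 784), nn.Sigmoid())

x = torch.rand(64, 784)                     # clean input batch (illustrative)
mask = (torch.rand_like(x) > 0.3).float()   # randomly zero roughly 30% of the inputs
x_tilde = x * mask                          # corrupted copy of the input

x_hat = decoder(encoder(x_tilde))           # reconstruct from the corrupted input
loss = nn.MSELoss()(x_hat, x)               # compare against the ORIGINAL input x
loss.backward()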

Advantages-

• It was introduced to achieve a good representation: one that can be obtained robustly from a corrupted input and that is useful for recovering the corresponding clean input.
• Corruption of the input can be done randomly by setting some of the inputs to zero; the remaining nodes simply copy the clean input into the noised version.
• The loss function is minimized between the output and the original input, not the corrupted input.
• Setting up a single-thread denoising autoencoder is easy.

Drawbacks-

• To train an autoencoder to denoise data, it is necessary to perform a preliminary stochastic mapping that corrupts the data and uses the corrupted version as the input.
• This model cannot learn a mapping that simply memorizes the training data, because the input and the target output are no longer the same.

2) Sparse Autoencoder
• Sparse autoencoders can have more hidden nodes than input nodes.
• They can still discover important features from the data.
• A generic sparse autoencoder can be visualized with the obscurity (shading) of a node corresponding to its level of activation.
• A sparsity constraint is introduced on the hidden layer. This prevents the output layer from simply copying the input data.
• Sparsity may be obtained by adding extra terms to the loss function during training, either by comparing the probability distribution of the hidden unit activations with some low desired value, or by manually zeroing all but the strongest hidden unit activations.
• Some of the most powerful AIs in the 2010s involved sparse autoencoders stacked inside deep neural networks.
The structure of an SAE and what makes it different from an Undercomplete AE

The purpose of an autoencoder is to encode important information efficiently. A common approach to achieve this is to create a bottleneck, which forces the model to preserve what is essential and discard the unimportant bits.

An autoencoder can distinguish what is essential by simultaneously training an encoder and a decoder, with the goal of the decoder being the recreation of the original data from the encoded representation.

The diagram below provides an example of an Undercomplete Autoencoder neural network with the bottleneck in the middle.

[Figure: Undercomplete Autoencoder architecture. Image by author, created using AlexNail's NN-SVG tool.]

Meanwhile, the goal of an SAE is the same as that of an Undercomplete AE, but it achieves it differently. Instead of (or in addition to) relying on fewer neurons, an SAE uses regularisation to enforce sparsity.

By sparsity, we mean that fewer neurons are allowed to be active at the same time, creating an information bottleneck similar to that of an Undercomplete AE.
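One way such a sparsity term might look in PyTorch is sketched below; the overcomplete hidden layer, the L1 penalty on activations, and the weight 1e-3 are all illustrative assumptions (a KL-divergence penalty toward a low target activation is another common choice):

import torch
import torch.nn as nn

# Overcomplete hidden layer (more hidden units than inputs) kept sparse by a penalty.
encoder = nn.Sequential(nn.Linear(784, 1024), nn.ReLU())
decoder = nn.Sequential(nn.Linear(1024, 784), nn.Sigmoid())

x = torch.rand(64, 784)
h = encoder(x)                              # hidden activations
x_hat = decoder(h)

recon_loss = nn.MSELoss()(x_hat, x)         # reconstruction error
sparsity_penalty = 1e-3 * h.abs().mean()    # pushes most activations toward zero
loss = recon_loss + sparsity_penalty
loss.backward()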

Advantages-

• Sparse autoencoders have a sparsity penalty that keeps the hidden activations close to, but not exactly, zero. The sparsity penalty is applied on the hidden layer in addition to the reconstruction error. This prevents overfitting.
• They can also keep only the highest activation values in the hidden layer and zero out the rest of the hidden nodes. This prevents the autoencoder from using all of the hidden nodes at once and forces only a reduced number of hidden nodes to be used.
Drawbacks-

• For this to work, it is essential that the individual nodes of a trained model that activate are data-dependent, i.e. that different inputs result in activations of different nodes through the network.

Contractive Autoencoder
• The objective of a contractive autoencoder is to obtain a robust learned representation that is less sensitive to small variations in the data.
• Robustness of the representation is achieved by applying a penalty term to the loss function.
• The contractive autoencoder is another regularization technique, just like the sparse and denoising autoencoders.
• However, this regularizer corresponds to the Frobenius norm of the Jacobian matrix of the encoder activations with respect to the input.
• The Frobenius norm of the Jacobian matrix of the hidden layer is calculated with respect to the input; it is simply the sum of the squares of all of its elements (a short sketch is given below).
https://www.geeksforgeeks.org/contractive-autoencoder-cae/
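A minimal sketch of this penalty in PyTorch, assuming a single sigmoid encoder layer, for which the squared Frobenius norm of the Jacobian has a simple closed form; the layer sizes and the weight lam are illustrative assumptions:

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 32), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())

def contractive_loss(x, lam=1e-4):
    h = encoder(x)                           # hidden code f(x)
    x_hat = decoder(h)                       # reconstruction g(f(x))
    recon = ((x_hat - x) ** 2).sum()
    # ||J_f(x)||_F^2 for a sigmoid layer: sum_j (h_j(1 - h_j))^2 * sum_i W_ji^2
    W = encoder[0].weight                    # shape (32, 784)
    dh = h * (1 - h)                         # derivative of the sigmoid activations
    jacobian_frob = ((dh ** 2) @ (W ** 2).sum(dim=1)).sum()
    return recon + lam * jacobian_frob

loss = contractive_loss(torch.rand(64, 784))
loss.backward()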

Advantages-

• A contractive autoencoder is a better choice than a denoising autoencoder for learning useful feature extraction.
• This model learns an encoding in which similar inputs have similar encodings. Hence, we are forcing the model to learn how to contract a neighborhood of inputs into a smaller neighborhood of outputs.
Variational Autoencoder:

A variational autoencoder differs from a plain autoencoder in that it provides a statistical (probabilistic) way of describing the samples of the dataset in latent space. Therefore, in a variational autoencoder, the encoder outputs a probability distribution (for example, a mean and a variance) in the bottleneck layer instead of a single output value.
Advantages-

• Unlike the other models, it gives significant control over how we want to model the latent distribution.
• After training, you can simply sample from the learned distribution and then decode the samples to generate new data.
Drawbacks-
• When training the model, the relationship of each parameter in the network to the final output loss must be computed using backpropagation. Because gradients cannot flow directly through a random sampling step, the sampling process requires some extra attention (see the reparameterization sketch below).
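A minimal sketch of the usual workaround, the reparameterization trick, in PyTorch; the layer sizes, the 16-dimensional Gaussian latent, and the loss terms are illustrative assumptions:

import torch
import torch.nn as nn

enc = nn.Linear(784, 64)
to_mu, to_logvar = nn.Linear(64, 16), nn.Linear(64, 16)   # encoder outputs a distribution
dec = nn.Sequential(nn.Linear(16, 784), nn.Sigmoid())

x = torch.rand(64, 784)
h = torch.relu(enc(x))
mu, logvar = to_mu(h), to_logvar(h)

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I), so gradients
# can flow back through mu and logvar even though z is a random sample.
eps = torch.randn_like(mu)
z = mu + torch.exp(0.5 * logvar) * eps
x_hat = dec(z)

recon = nn.functional.binary_cross_entropy(x_hat, x, reduction="sum")
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # KL(q(z|x) || N(0, I))
loss = recon + kl
loss.backward()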

5. Denoising Autoencoders

• An autoencoder that receives a corrupted data point as input and is trained to predict the original, uncorrupted data point as its output.
• Traditional autoencoders minimize L(x, g(f(x))), where L is a loss function penalizing g(f(x)) for being dissimilar from x, such as the L2 norm of the difference (mean squared error).
• A DAE instead minimizes L(x, g(f(x̃))), where x̃ is a copy of x that has been corrupted by some form of noise.
• The autoencoder must undo this corruption rather than simply copying its input.

Example of Noise in a DAE

• An autoencoder with high capacity can end up learning an identity function (also called a null function), where input = output.
• A DAE can solve this problem by corrupting the input data.
• How much noise to add? Corrupt the input by setting 30-50% of randomly chosen input nodes to zero.

[Figure: original input, corrupted data, reconstructed data.]

DAE Training procedure

• The computational graph of the cost function is shown below.
• The DAE is trained to reconstruct the clean data point x from its corrupted version. This is accomplished by minimizing the loss L = -log p_decoder(x | h = f(x̃)).
• The corruption process C(x̃ | x) is a conditional distribution over corrupted samples x̃, given the data sample x.
• The autoencoder learns a reconstruction distribution p_reconstruct(x | x̃) estimated from training pairs (x, x̃) as follows:
  1. Sample a training example x from the training data.
  2. Sample a corrupted version x̃ from C(x̃ | x).
  3. Use (x, x̃) as a training example for estimating the autoencoder distribution p_reconstruct(x | x̃) = p_decoder(x | h), with h the output of the encoder f(x̃) and p_decoder typically defined by a decoder g(h).
• The DAE thus performs stochastic gradient descent on the expectation E_{x ~ p̂_data(x), x̃ ~ C(x̃ | x)} [-log p_decoder(x | h = f(x̃))] (a sketch of this training loop follows).
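A hedged sketch of this three-step procedure as a PyTorch training loop; the Gaussian corruption, the MSE surrogate for -log p_decoder, and all sizes are assumptions made for illustration:

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 64), nn.ReLU())
decoder = nn.Sequential(nn.Linear(64, 784), nn.Sigmoid())
opt = torch.optim.SGD(list(encoder.parameters()) + list(decoder.parameters()), lr=0.01)

data = torch.rand(1000, 784)                    # stand-in training set

for step in range(100):
    idx = torch.randint(0, data.shape[0], (64,))
    x = data[idx]                               # 1. sample a training example x
    x_tilde = x + 0.3 * torch.randn_like(x)     # 2. sample a corrupted version from C(x~ | x)
    x_hat = decoder(encoder(x_tilde))           # 3. reconstruct x from the corrupted version
    loss = nn.MSELoss()(x_hat, x)               # MSE plays the role of -log p_decoder (Gaussian case)
    opt.zero_grad()
    loss.backward()
    opt.step()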

Estimating the Score

• An autoencoder can be trained by encouraging the model to have the same score as the data distribution at every training point x.
• The score is a particular gradient field: ∇x log p(x).
• Learning the gradient field of log p_data is one way to learn the structure of p_data itself.
• Score matching works by fitting the slope (score) of the model density to the slope of the true underlying density at the data points.
• A DAE with conditionally Gaussian p(x | h) estimates this score as (g(f(x)) - x).
• The DAE is trained to minimize ||g(f(x̃)) - x||^2.
• The DAE therefore estimates a vector field, as illustrated next.
DAE learns a vector field

• Training examples x lie on a low-dimensional manifold.
• Training examples x are shown as red crosses.
• The gray circles indicate equiprobable corruptions.
• The vector field (g(f(x)) - x), indicated by green arrows, estimates the score ∇x log p(x), which is the slope of the density of the data.

Vector field learnt by a DAE

• The data concentrate near a 1-D curved manifold.
• Each arrow is proportional to the reconstruction minus the input vector of the DAE and points towards higher probability.
• Where the probability is maximal, the arrows shrink.
