
UNIT-4

concept of convolutional neural network


Concept of Convolutional Neural Network (CNN)

A Convolutional Neural Network (CNN) is a type of deep learning neural network designed
primarily for processing structured grid-like data, such as images or time-series data. CNNs are
particularly effective for tasks like image recognition, object detection, and natural language
processing due to their ability to automatically and adaptively learn spatial hierarchies of
features from data.

Key Components of CNN:

1. Convolution Layer:
○ The core building block of CNN.
○ Performs a convolution operation by sliding a filter (kernel) over the input data to
detect local patterns such as edges, corners, and textures.
○ Output: A feature map that highlights the detected patterns.
2. Mathematical Operation:
○ For an input I and a kernel K, each entry of the feature map is computed as
S(i, j) = (I * K)(i, j) = Σ_m Σ_n I(i + m, j + n) · K(m, n)
○ A small numeric sketch of this operation (together with ReLU and pooling) appears after this list.
3. ReLU (Rectified Linear Unit):
○ Introduces non-linearity to the model by applying the activation function f(x) = max(0, x), which keeps positive values unchanged and sets negative values to zero.
○ Helps the network learn complex patterns by allowing non-linear transformations.


4. Pooling Layer:
○ Reduces the spatial dimensions of feature maps while retaining important
features.
○ Common pooling techniques:
■ Max Pooling: Selects the maximum value in a region.
■ Average Pooling: Calculates the average value in a region.
○ Helps in:
■ Reducing computational complexity.
■ Making the model invariant to small translations in the input.
5. Fully Connected Layer:
○ A dense layer where all neurons are connected to the previous layer.
○ Combines the extracted features to make predictions.
6. Dropout (Optional):
○ A regularization technique to prevent overfitting by randomly dropping a fraction
of neurons during training.
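The numeric sketch referred to above: a minimal NumPy implementation of a convolution (in the cross-correlation form used by CNN frameworks), ReLU, and max pooling on a small made-up 5x5 image. The image values and the vertical-edge kernel are illustrative assumptions, not part of the notes.

```python
# Minimal NumPy sketch of the core CNN operations: convolution, ReLU, max pooling.
import numpy as np

def convolve2d(image, kernel):
    """Valid (no padding), stride-1 2D convolution producing a feature map."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(0, x)

def max_pool(feature_map, size=2, stride=2):
    oh = (feature_map.shape[0] - size) // stride + 1
    ow = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            region = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = region.max()
    return out

image = np.array([[1, 2, 0, 1, 3],          # assumed toy 5x5 grayscale image
                  [0, 1, 2, 3, 1],
                  [1, 0, 1, 2, 2],
                  [2, 1, 0, 1, 0],
                  [0, 2, 1, 0, 1]], dtype=float)
edge_kernel = np.array([[1, 0, -1],         # simple vertical-edge detector
                        [1, 0, -1],
                        [1, 0, -1]], dtype=float)

fmap = relu(convolve2d(image, edge_kernel))   # 3x3 feature map
pooled = max_pool(fmap, size=2, stride=2)     # 1x1 after 2x2 max pooling
print(fmap, pooled, sep="\n")
```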
Workflow of CNN:

1. Input Layer:
○ Accepts raw input, such as an image represented as a multi-dimensional array
(e.g., RGB image as a 3D array).
2. Feature Extraction:
○ Sequential layers of convolution, ReLU, and pooling extract hierarchical
features from the input.
3. Flattening:
○ The multi-dimensional feature map is flattened into a 1D vector for input to the
fully connected layer.
4. Classification:
○ The flattened vector is passed through one or more fully connected layers and
finally a softmax or sigmoid activation function to output class probabilities.
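A minimal sketch of this workflow using tf.keras; the 28x28 grayscale input (MNIST-like), 10 output classes, and layer sizes are illustrative assumptions rather than values fixed by the notes.

```python
# Illustrative CNN following the workflow above:
# input -> conv/ReLU/pool (feature extraction) -> flatten -> fully connected -> softmax.
# Requires TensorFlow 2.x.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(28, 28, 1)),   # 1-2. Input + convolution with ReLU
    layers.MaxPooling2D((2, 2)),              #      Pooling
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),                         # 3. Flattening
    layers.Dense(64, activation="relu"),      # 4. Fully connected layer
    layers.Dense(10, activation="softmax"),   #    Class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```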

Applications of CNN:

1. Image Classification: Identifying objects or features in images (e.g., recognizing handwritten digits in the MNIST dataset).
2. Object Detection: Locating and identifying objects in an image (e.g., YOLO, R-CNN).
3. Medical Imaging: Detecting diseases from X-rays, MRIs, etc.
4. Natural Language Processing (NLP): Analyzing sentence structures or understanding
texts.
5. Autonomous Vehicles: Detecting road signs, pedestrians, and lanes.

Advantages of CNN:

● Reduces the number of parameters compared to traditional fully connected networks, making them computationally efficient.
● Effectively captures spatial and hierarchical patterns in data.
● Automatically extracts features, reducing the need for manual feature engineering.

Limitations of CNN:

● Requires large datasets and significant computational power for training.


● Vulnerable to adversarial attacks where small perturbations can mislead the model.
● Lack of interpretability in some cases due to the "black box" nature of deep learning.
Types of layers (Convolutional Layers, Activation Function, Pooling, Fully Connected)

Types of Layers in a Convolutional Neural Network (CNN)

A CNN is composed of several specialized layers, each serving a distinct purpose in the feature
extraction and classification process. Below are the key types of layers used in CNNs:

1. Convolutional Layers

● Purpose: Extract local features by performing a convolution operation over the input.
● How it works:
○ A small filter (or kernel) slides over the input data to compute the dot product
between the filter values and the input at each location.
○ The output is a feature map that represents the detected features.
● Key Parameters:
○ Filter Size: Dimensions of the kernel (e.g., 3x3, 5x5).
○ Stride: The number of steps the filter moves during convolution.
○ Padding: Adding extra layers of zeros around the input to preserve dimensions.
■ Same Padding: Keeps the output dimensions the same as the input.
■ Valid Padding: No padding is added, reducing the output size.
● Example:
○ Input: 28x28 grayscale image.
○ Kernel: 3x3 filter.
○ Output: Feature map showing edges, corners, or textures.
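As a quick aid for the parameters above, a tiny helper computing the output size of a convolution with the standard formula out = ⌊(n + 2p − f) / s⌋ + 1; the example numbers below are assumptions chosen to match the 28x28 / 3x3 case.

```python
# Output spatial size of a convolution: floor((n + 2p - f) / s) + 1,
# where n = input size, f = filter size, s = stride, p = padding.
def conv_output_size(n, f, s=1, p=0):
    return (n + 2 * p - f) // s + 1

print(conv_output_size(28, 3, s=1, p=0))  # valid padding -> 26
print(conv_output_size(28, 3, s=1, p=1))  # same padding  -> 28
print(conv_output_size(28, 3, s=2, p=1))  # stride 2      -> 14
```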

2. Activation Function Layer

● Purpose: Introduce non-linearity to the model, allowing it to learn complex patterns.


● Common Activation Functions:
○ ReLU: f(x) = max(0, x) — fast and widely used in hidden layers.
○ Sigmoid: f(x) = 1 / (1 + e^(−x)) — squashes values into (0, 1); used for binary outputs.
○ Tanh: f(x) = (e^x − e^(−x)) / (e^x + e^(−x)) — squashes values into (−1, 1).
○ Softmax: converts a vector of scores into class probabilities; used in the output layer for multi-class classification.

3. Pooling Layers

● Purpose: Reduce spatial dimensions of feature maps while retaining the most important
information.
● Types of Pooling:
○ Max Pooling:
■ Selects the maximum value in each region.
■ Highlights the most prominent features.
○ Average Pooling:
■ Computes the average value in each region.
■ Retains more global information.
● Key Parameters:
○ Filter Size: The dimensions of the pooling region (e.g., 2x2, 3x3).
○ Stride: Steps the pooling operation moves across the feature map.
● Benefits:
○ Reduces computational load.
○ Provides invariance to small translations in the input.
● Example:
○ Input: 4x4 feature map.
○ Max Pooling (2x2, stride 2): Reduces it to a 2x2 feature map.

4. Fully Connected Layers

● Purpose: Connect all neurons in one layer to every neuron in the next layer to perform
classification.
● How it works:
○ Flattens the output of the previous layers (e.g., a 2D feature map) into a 1D
vector.
○ Applies matrix multiplication and biases to produce outputs.
● Key Features:
○ Typically used at the end of a CNN to combine extracted features into a final
decision.
○ Outputs probabilities (via softmax) or a specific value depending on the task.
● Example:
○ Flattened input: [0.5, 0.7, 0.1, 0.8].
○ Output: [0.2 (class 1), 0.8 (class 2)].
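To make the example concrete, here is a tiny NumPy sketch of a fully connected layer followed by softmax. The weight and bias values are made-up assumptions for illustration only (they will not reproduce the exact 0.2/0.8 output above).

```python
# Fully connected layer: y = softmax(W x + b), mapping 4 flattened features to 2 classes.
import numpy as np

def softmax(z):
    z = z - z.max()               # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

x = np.array([0.5, 0.7, 0.1, 0.8])        # flattened feature vector from the notes
W = np.array([[ 0.2, -0.4, 0.1, 0.3],     # assumed weights: 2 classes x 4 features
              [-0.1,  0.5, 0.2, 0.7]])
b = np.array([0.0, 0.1])                  # assumed biases

probs = softmax(W @ x + b)
print(probs)                              # two class probabilities summing to 1
```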

Summary of the Role of Each Layer:


Layer Type            Function
Convolutional         Detect local patterns (edges, corners, textures) by applying filters over the input.
Activation Function   Introduce non-linearity, enabling the network to learn complex relationships.
Pooling               Downsample feature maps, reducing spatial dimensions and computation while retaining important features.
Fully Connected       Combine all features extracted by previous layers into a final decision or prediction.

Derivation of Backpropagation Algorithm


Derivation of Backpropagation Algorithm

The Backpropagation Algorithm is the key method for training neural networks, including
Convolutional Neural Networks (CNNs). It computes the gradients of the loss function with
respect to the weights and biases of the network using the chain rule of calculus. These
gradients are then used to update the parameters during training via optimization techniques
such as Gradient Descent.

1. Preliminaries:
○ Consider a feedforward network in which layer l has weights W^l, biases b^l, and activation function f. For an input a^0 = x, the forward pass computes z^l = W^l a^(l−1) + b^l and a^l = f(z^l), and a loss L (e.g., mean squared error) is measured at the output layer N.
2. Error at the output layer:
○ δ^N = ∂L/∂a^N ⊙ f′(z^N), where ⊙ denotes element-wise multiplication.
3. Error at a hidden layer (chain rule, propagated backwards):
○ δ^l = ((W^(l+1))^T δ^(l+1)) ⊙ f′(z^l)
4. Gradients of the parameters:
○ ∂L/∂W^l = δ^l (a^(l−1))^T and ∂L/∂b^l = δ^l
5. Parameter update (gradient descent with learning rate η):
○ W^l ← W^l − η ∂L/∂W^l and b^l ← b^l − η ∂L/∂b^l
6. Repeat steps 2–5 for each training example (or mini-batch) until the loss converges.
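A minimal NumPy sketch of these equations for a one-hidden-layer network with sigmoid activations and mean squared error. The toy input, target, and layer sizes (2-3-1) are assumptions made only to show the gradient flow.

```python
# One step of backpropagation for a 2-3-1 network (sigmoid activations, MSE loss).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = np.array([[0.5], [0.2]])          # input column vector (2x1), assumed toy example
y = np.array([[1.0]])                 # target output (1x1)

W1, b1 = rng.normal(size=(3, 2)), np.zeros((3, 1))   # hidden layer parameters
W2, b2 = rng.normal(size=(1, 3)), np.zeros((1, 1))   # output layer parameters
eta = 0.1                                            # learning rate

# Forward pass: z^l = W^l a^(l-1) + b^l, a^l = f(z^l)
z1 = W1 @ x + b1;  a1 = sigmoid(z1)
z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
loss = 0.5 * np.sum((a2 - y) ** 2)

# Backward pass (the deltas from the derivation above); f'(z) = a(1-a) for sigmoid
delta2 = (a2 - y) * a2 * (1 - a2)           # output-layer error
delta1 = (W2.T @ delta2) * a1 * (1 - a1)    # hidden-layer error via the chain rule

# Gradients and gradient-descent update
W2 -= eta * (delta2 @ a1.T);  b2 -= eta * delta2
W1 -= eta * (delta1 @ x.T);   b1 -= eta * delta1
print("loss:", loss)
```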
Advantages of Backpropagation

● Efficient computation of gradients.


● Scales to large networks and datasets.
● Allows deep networks to learn complex patterns.

Limitations

● Prone to vanishing or exploding gradients in very deep networks.


● Requires a differentiable loss function and activation functions.

Generalization (Single-layer feedforward network, Multilayer feedforward network, and Single node with its own feedback)

Generalization in Neural Networks

Generalization refers to the ability of a neural network to perform well on unseen data (test
data) after being trained on a training dataset. A model generalizes well if it effectively captures
the underlying patterns of the data without overfitting or underfitting.

Types of Neural Networks:

1. Single-Layer Feedforward Network

● Structure:
○ Consists of one input layer and one output layer.
○ Neurons in the input layer are directly connected to the output layer, with no
hidden layers in between.
● Functionality:
○ Computes a simple mapping from inputs to outputs using linear or non-linear
activation functions.
● Use Cases:
○ Suitable for solving linearly separable problems (e.g., perceptron model).
● Advantages:
○ Simplicity and computational efficiency.
● Limitations:
○ Cannot solve non-linear problems due to the lack of hidden layers.
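A minimal sketch of a single-layer network: one perceptron with a step activation trained with the perceptron rule on the linearly separable AND problem. The learning rate and epoch count are assumed values.

```python
# Perceptron: a single-layer feedforward network trained on the AND gate.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])                      # AND targets (linearly separable)

w = np.zeros(2)
b = 0.0
eta = 0.1                                       # learning rate (assumed)

for epoch in range(20):                         # small fixed number of epochs
    for xi, target in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0       # step activation
        error = target - pred
        w += eta * error * xi                   # perceptron weight update
        b += eta * error

print([1 if xi @ w + b > 0 else 0 for xi in X])  # -> [0, 0, 0, 1]
```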

2. Multilayer Feedforward Network (Multilayer Perceptron or MLP)

● Structure:
○ Consists of an input layer, one or more hidden layers, and an output layer.
○ Neurons in each layer are fully connected to the next layer.
● Functionality:
○ Introduces hidden layers to enable the network to learn complex, non-linear
mappings.
○ Uses activation functions like ReLU, sigmoid, or tanh in hidden layers.
● Learning:
○ Trained using backpropagation to minimize the loss function.
● Use Cases:
○ Regression, classification, and other tasks requiring modeling of complex
relationships.
● Advantages:
○ Can approximate any continuous function (universal approximation theorem).
● Limitations:
○ Requires careful tuning of hyperparameters (e.g., number of layers, learning
rate).

3. Single Node with Its Own Feedback (Recurrent Structure)

● Structure:
○ A single neuron with a feedback loop connecting its output back to its input.
● Functionality:
○ Can store and process temporal information.
○ Models time-dependent processes such as time-series data or sequential
patterns.
● Example:
○ The simplest form of a recurrent neural network (RNN).
● Learning:
○ Feedback allows it to maintain a memory of past inputs, making it suitable for
dynamic systems.
● Use Cases:
○ Control systems, time-series prediction, or simple state-based models.
● Advantages:
○ Simple and effective for problems with temporal dependencies.
● Limitations:
○ Limited expressiveness compared to more advanced recurrent networks like
RNNs or LSTMs.
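A small sketch of a single node with its own feedback: the neuron's output at time t depends on the current input and on its previous output, y_t = tanh(w·x_t + u·y_(t−1) + b). The weights here are fixed, assumed values rather than learned ones.

```python
# Single neuron with a self-feedback loop processing a short input sequence.
import numpy as np

w, u, b = 0.8, 0.5, 0.0          # input weight, feedback weight, bias (assumed)
inputs = [1.0, 0.0, 0.0, 1.0]    # toy time series
y = 0.0                          # initial state (previous output)

for t, x_t in enumerate(inputs):
    y = np.tanh(w * x_t + u * y + b)   # output feeds back into the next step
    print(f"t={t}, x={x_t}, y={y:.4f}")
```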

Comparison of the Networks:


Feature              Single-Layer Feedforward          Multilayer Feedforward                  Single Node with Feedback
Structure            No hidden layers                  One or more hidden layers               Single node with feedback
Complexity           Low                               Medium to high                          Low to medium
Capability           Linear problems only              Non-linear and complex problems         Time-dependent problems
Learning             Simple rule (e.g., perceptron)    Backpropagation                         Temporal learning
Use Case Examples    AND, OR gates                     Image classification, regression        Time-series prediction
Advantages           Simplicity                        Handles complex relationships           Memory for temporal patterns
Limitations          Cannot solve non-linear           Computational cost, prone to            Limited to simple feedback
                     problems (e.g., XOR)              overfitting                             patterns

Generalization in These Networks

1. Single-Layer Feedforward Networks:


○ Generalize poorly for non-linear problems as they lack hidden layers.
○ Can achieve good generalization for linearly separable tasks.
2. Multilayer Feedforward Networks:
○ Achieve better generalization by capturing complex patterns in data.
○ Regularization techniques (e.g., dropout, L2 regularization) help improve
generalization by preventing overfitting.
3. Single Node with Feedback:
○ Generalization depends on the type of feedback and the task.
○ Suitable for problems where temporal relationships are critical.

UNIT-5
Roulette-wheel based on fitness and Roulette-wheel based on rank

Roulette-Wheel Selection in Evolutionary Algorithms


Roulette-wheel selection (or fitness-proportionate selection) is a common technique used in
genetic algorithms to probabilistically select individuals for reproduction based on their fitness.
The probability of selection for an individual is proportional to its fitness or its rank.

1. Roulette-Wheel Based on Fitness

Description:
In this method, the probability of selecting an individual is proportional to its fitness value. Higher
fitness individuals are more likely to be selected.

Steps:

1. Compute the fitness f_i of each individual and the total fitness F = Σ f_i.
2. Assign each individual the selection probability p_i = f_i / F.
3. Form the cumulative probabilities (the "wheel"), where each individual owns a slice proportional to p_i.
4. Generate a random number r in [0, 1) and select the individual whose slice contains r.
5. Repeat the spin until the required number of parents has been selected.

Advantages:

● Encourages selection of higher fitness individuals, improving convergence.


● Simple to implement.

Disadvantages:

● Premature Convergence: Extremely fit individuals may dominate, leading to loss of diversity.
● Scaling Problem: If fitness values vary widely, selection may overly favor a few
individuals.
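A minimal NumPy sketch of the fitness-proportionate procedure described above; the fitness values (10, 20, 30, 40) match the worked example later in this section.

```python
# Fitness-proportionate (roulette-wheel) selection.
import numpy as np

def roulette_wheel_select(fitness, rng):
    probs = np.asarray(fitness, dtype=float)
    probs = probs / probs.sum()                        # p_i = f_i / total fitness
    cum = np.cumsum(probs)                             # cumulative "wheel" segments
    r = rng.random()                                   # spin the wheel
    return int(np.searchsorted(cum, r, side="right"))  # index whose slice contains r

rng = np.random.default_rng(42)
fitness = [10, 20, 30, 40]                             # individuals A, B, C, D
picks = [roulette_wheel_select(fitness, rng) for _ in range(10000)]
print([picks.count(i) / len(picks) for i in range(4)])  # ~[0.1, 0.2, 0.3, 0.4]
```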

2. Roulette-Wheel Based on Rank

Description:
Instead of using fitness values directly, individuals are ranked based on their fitness. The
probability of selection is based on the rank, not the actual fitness value. This mitigates issues
with fitness scaling.

Steps:

1. Sort the population by fitness and assign ranks (e.g., rank 1 to the least fit, rank N to the fittest).
2. Compute the total of the ranks R = Σ r_i and assign each individual the selection probability p_i = r_i / R.
3. Perform roulette-wheel selection on these rank-based probabilities exactly as in the fitness-based method.

Advantages:

● Reduces dominance of high fitness individuals, maintaining diversity.


● Handles problems with large variations in fitness values.

Disadvantages:

● Can slow down convergence since fitness differences are not directly used.

Comparison: Fitness-Based vs. Rank-Based Roulette Wheel


Feature                     Fitness-Based Roulette                               Rank-Based Roulette
Basis for Selection         Direct fitness values                                Rank of individuals
Probability Assignment      Proportional to fitness values                       Proportional to ranks
Effect of Fitness Scaling   Highly affected (e.g., if fitness values are extreme)  Less affected
Diversity Maintenance       Low (can lead to premature convergence)              High (balances exploration and exploitation)
Convergence Speed           Faster but risks premature convergence               Slower but more robust to scaling issues
Use Cases                   Suitable for problems with well-scaled fitness values  Suitable for problems with large fitness variations

Example

Fitness-Based Roulette:

Population fitness values: Individual A: 10, B: 20, C: 30, D: 40

● Total fitness: 10 + 20 + 30 + 40 = 100.
● Probabilities:
○ P(A) = 10/100 = 0.1
○ P(B) = 20/100 = 0.2
○ P(C) = 30/100 = 0.3
○ P(D) = 40/100 = 0.4

Rank-Based Roulette:

Population ranks (rank 1 for the least fit, rank 4 for the fittest): Individual A (Rank 1), B (Rank 2), C (Rank 3), D (Rank 4)

● Total rank: 1 + 2 + 3 + 4 = 10.
● Probabilities:
○ P(A) = 1/10 = 0.1
○ P(B) = 2/10 = 0.2
○ P(C) = 3/10 = 0.3
○ P(D) = 4/10 = 0.4
● Note: with these evenly spaced fitness values the rank-based probabilities happen to match the fitness-based ones, but they would stay the same even if D's fitness were 4000 instead of 40 — exactly the robustness to fitness scaling that rank-based selection provides.
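A short NumPy sketch computing the rank-based probabilities for this example, assigning rank 1 to the least fit individual as above.

```python
# Rank-based selection probabilities for the example population.
import numpy as np

fitness = np.array([10, 20, 30, 40])           # individuals A, B, C, D
order = np.argsort(fitness)                    # indices from least to most fit
ranks = np.empty_like(order)
ranks[order] = np.arange(1, len(fitness) + 1)  # least fit -> rank 1, fittest -> rank N

probs = ranks / ranks.sum()
print({c: float(p) for c, p in zip("ABCD", probs)})  # {'A': 0.1, 'B': 0.2, 'C': 0.3, 'D': 0.4}

# The same ranks (and probabilities) result even if D's fitness were 4000,
# which is why rank-based selection is robust to fitness scaling.
```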

When to Use Each Method

1. Fitness-Based:
○ When fitness values are well-scaled and represent individual performance
accurately.
○ Faster convergence is desired, and diversity is not a major concern.
2. Rank-Based:
○ When fitness values vary widely or are not linearly scaled.
○ To maintain diversity and prevent premature convergence.

Components, GA cycle of reproduction

Components of a Genetic Algorithm (GA)

A Genetic Algorithm (GA) is inspired by natural selection and evolution. Its components are
designed to mimic the processes of reproduction, mutation, and survival of the fittest.

1. Initial Population

● Definition: A set of potential solutions (individuals or chromosomes) to the optimization problem.
● Representation:
○ Binary Strings: Each individual is represented as a binary string (e.g., 101010).
○ Real Numbers: Represents continuous values for optimization problems.
○ Permutations: Useful in problems like the traveling salesman problem.
● Size: The population size affects the diversity of solutions and computational complexity.

2. Fitness Function

● Definition: A function that evaluates how good a solution (individual) is for the given
problem.
● Purpose:
○ Assigns a fitness value to each individual.
○ Guides the selection of individuals for reproduction.
● Examples:
○ In a maximization problem, fitness could be the objective function value.
○ In a minimization problem, fitness could be the inverse of the objective value.
3. Selection

● Definition: The process of choosing individuals for reproduction based on their fitness.
● Techniques:
○ Roulette-Wheel Selection: Probabilistic selection based on fitness or rank.
○ Tournament Selection: Randomly selects individuals and chooses the best
among them.
○ Rank-Based Selection: Selection probability is based on rank, not raw fitness.
○ Elitism: Guarantees the best individuals are carried over to the next generation.

4. Crossover (Recombination)

● Definition: Combines genetic material from two parent individuals to create offspring.
● Types:
○ Single-Point Crossover: A single point is selected, and parts are swapped
between parents.
○ Two-Point Crossover: Two points are selected, and segments between them
are swapped.
○ Uniform Crossover: Each gene in the offspring is randomly selected from one of
the parents.
● Purpose:
○ Introduces diversity by combining traits from both parents.

5. Mutation

● Definition: Introduces random changes to individuals to maintain diversity and avoid local optima.
● Types:
○ Bit Flip: For binary chromosomes, flips a bit (e.g., 0 → 1, 1 → 0).
○ Gaussian Mutation: Adds a small random value to real-valued genes.
○ Swap Mutation: Swaps two genes in a permutation representation.
● Purpose:
○ Prevents premature convergence.
○ Introduces variability into the population.

6. Termination Criteria

● Definition: Conditions under which the algorithm stops.


● Common Criteria:
○ A predefined number of generations is reached.
○ Fitness of the best individual does not improve significantly over time.
○ A satisfactory solution is found.

GA Cycle of Reproduction

The Genetic Algorithm follows an iterative process (cycle) to evolve the population over
successive generations:

Step 1: Initialization

● Generate the initial population randomly or using heuristics.

Step 2: Evaluation

● Calculate the fitness of each individual in the population using the fitness function.

Step 3: Selection

● Select individuals based on their fitness using techniques like roulette-wheel selection or
tournament selection.

Step 4: Crossover

● Pair selected individuals (parents) and perform crossover to produce offspring.


● Offspring inherit genetic material from both parents.

Step 5: Mutation

● Apply mutation to offspring to introduce random variations.

Step 6: Replacement

● Replace less fit individuals in the population with new offspring.


● Use elitism if necessary to retain the best solutions.

Step 7: Termination Check

● If the termination criteria are met, stop the algorithm.


● Otherwise, go back to Step 2 and repeat the cycle.
Illustration of the GA Cycle

Initial population → Evaluate fitness → Selection → Crossover → Mutation → Replacement → (termination check: stop, or return to Evaluation). A minimal code sketch of this loop is given below.
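A minimal skeleton of this cycle in Python for the toy "OneMax" problem (maximise the number of 1-bits in a binary string). Population size, crossover/mutation rates, and generation count are assumed values; tournament selection and elitism stand in for the selection and replacement steps.

```python
# Skeleton GA cycle: initialise -> evaluate -> select -> crossover -> mutate -> replace.
import random

GENES, POP_SIZE, GENERATIONS = 20, 30, 50
CROSSOVER_RATE, MUTATION_RATE = 0.9, 0.02

def fitness(ind):                      # OneMax: count of 1-bits
    return sum(ind)

def tournament(pop, k=3):              # tournament selection
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):                 # single-point crossover
    if random.random() < CROSSOVER_RATE:
        point = random.randint(1, GENES - 1)
        return p1[:point] + p2[point:]
    return p1[:]

def mutate(ind):                       # bit-flip mutation
    return [1 - g if random.random() < MUTATION_RATE else g for g in ind]

# Step 1: initialisation
population = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    # Steps 2-6: evaluate, select, crossover, mutate, replace (with elitism)
    best = max(population, key=fitness)
    offspring = [mutate(crossover(tournament(population), tournament(population)))
                 for _ in range(POP_SIZE - 1)]
    population = [best] + offspring    # elitism: keep the best individual

print("best fitness:", fitness(max(population, key=fitness)))
```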

Genetic Programming: optimization of travelling salesman problem using genetic algorithm

Genetic Algorithm for Optimizing the Traveling Salesman Problem (TSP)

The Traveling Salesman Problem (TSP) involves finding the shortest possible route that visits a
set of cities exactly once and returns to the starting city. Solving TSP using a Genetic
Algorithm (GA) involves encoding routes as chromosomes and evolving them over generations
to minimize the total distance traveled.

Steps to Solve TSP Using GA

1. Chromosome Representation (Encoding)

● Each chromosome is a permutation of the cities, giving the order in which they are visited (e.g., A → C → B → D).
● Every city must appear exactly once, so genetic operators must preserve valid permutations.

2. Fitness Function

● Fitness is based on the total length of the route, including the return to the starting city.
● Shorter routes receive higher fitness, for example fitness = 1 / (total route distance).

3. Genetic Operators

1. Selection:
○ Select parents based on their fitness. Common methods:
■ Roulette-Wheel Selection: Probabilities are proportional to fitness.
■ Tournament Selection: Randomly pick a subset of individuals and select
the best one.
2. Crossover (Recombination):
○ Combines two parent routes to produce offspring while preserving valid city
permutations.
○ Common Crossover Methods:
■ Partially Mapped Crossover (PMX):
■ Randomly choose two crossover points.
■ Exchange segments between parents while keeping the rest of the
route valid.
■ Order Crossover (OX):
■ Randomly choose a segment from one parent.
■ Fill the rest of the route with cities from the other parent in the
order they appear.
3. Mutation:
○ Introduces random variations to maintain diversity in the population.
○ Common Mutation Methods:
■ Swap Mutation:
■ Swap the positions of two cities in the route.
■ Inversion Mutation:
■ Reverse the order of cities in a randomly selected segment.
■ Scramble Mutation:
■ Randomly shuffle a subset of cities in the route.
4. Replacement:
○ Replace less fit individuals in the population with new offspring.
○ Use elitism to preserve the best solutions.

4. Algorithm Cycle

1. Initialize Population:
○ Generate random permutations of cities.
2. Evaluate Fitness:
○ Calculate the fitness of each route in the population.
3. Selection:
○ Select parents based on fitness.
4. Crossover:
○ Perform crossover to generate offspring.
5. Mutation:
○ Apply mutation to offspring.
6. Replacement:
○ Replace the least fit individuals in the population.
7. Repeat:
○ Repeat the process for a predefined number of generations or until convergence.
8. Output:
○ The best solution (route with the shortest distance).

Example

Cities and Distances


Let the distances between 4 cities (A, B, C, D) be:

A B C D

A 0 10 15 20

B 10 0 35 25

C 15 35 0 30

D 20 25 30 0

1. Example evaluation using this matrix: the route A → B → D → C → A has total distance 10 + 25 + 30 + 15 = 80 (fitness = 1/80), while A → B → C → D → A has distance 10 + 35 + 30 + 20 = 95 (fitness = 1/95), so the first route is fitter. A code sketch of the full GA on this matrix follows.
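A compact sketch of the GA applied to this 4-city instance, using order crossover (OX) and swap mutation. The population size, rates, and generation count are assumed values; for only 4 cities the optimum (length 80) is found almost immediately, but the structure scales to larger instances.

```python
# GA for TSP on the 4-city example: routes are permutations of city indices,
# fitness = 1 / total route length (shorter route = fitter).
import random

CITIES = ["A", "B", "C", "D"]
DIST = [[0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0]]

def route_length(route):
    # Total tour length, including the return leg to the starting city.
    return sum(DIST[route[i]][route[(i + 1) % len(route)]] for i in range(len(route)))

def order_crossover(p1, p2):
    """OX: copy a slice from parent 1, fill remaining positions with parent 2's cities in order."""
    a, b = sorted(random.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[a:b + 1] = p1[a:b + 1]
    fill = [c for c in p2 if c not in child]
    for i in range(len(child)):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

def swap_mutation(route, rate=0.2):
    route = route[:]
    if random.random() < rate:
        i, j = random.sample(range(len(route)), 2)
        route[i], route[j] = route[j], route[i]
    return route

def tournament(pop, k=2):
    return min(random.sample(pop, k), key=route_length)   # shorter route wins

population = [random.sample(range(4), 4) for _ in range(10)]
for _ in range(30):
    best = min(population, key=route_length)               # elitism
    population = [best] + [swap_mutation(order_crossover(tournament(population),
                                                         tournament(population)))
                           for _ in range(len(population) - 1)]

best = min(population, key=route_length)
print("best route:", " -> ".join(CITIES[i] for i in best + [best[0]]),
      "| length:", route_length(best))   # optimal length for this matrix is 80
```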


Advantages of GA for TSP

1. Can handle large search spaces and complex constraints.


2. Does not require problem-specific heuristics.
3. Effective for finding approximate solutions to NP-hard problems like TSP.

Limitations

1. Computationally expensive for large populations and cities.


2. Premature convergence to local optima if diversity is not maintained.
3. Requires fine-tuning of parameters (population size, mutation rate, etc.).
