Unit IV - Learning

Uploaded by

Yashaswini Gowda
Copyright
© © All Rights Reserved

UNIT IV

Learning

Learning– Forms of Learning, Supervised Learning, Machine Learning - Decision Trees, Regression and Classification
with Linear Models, Artificial Neural Networks, Support Vector Machines

4.1 Learning – Introduction

Machines cannot be called intelligent until they can learn to do new things and adapt to new situations,
rather than simply doing as they are told.
If a programmer could program a machine to learn, that is, to improve automatically with experience, the impact
would be dramatic.
If computers could learn from medical records, they could identify effective treatments for new
diseases; if smart houses could learn from experience, they could optimize energy costs based on the
particular usage patterns of their occupants; and if personal software assistants (PSAs) could
learn the evolving interests of their users, they could highlight especially relevant stories from the online morning
newspaper.

Machine learning is used in many different applications, from image and speech recognition to natural language
processing, recommendation systems, fraud detection, portfolio optimization, task automation, and so on. Machine
learning models are also used to power autonomous vehicles, drones, and robots, making them more intelligent and
adaptable to changing environments.

In recent years, deep learning, a subset of machine learning, has gained significant attention and has been
particularly successful in tasks such as image recognition, natural language processing, and speech recognition. Deep
learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), learn hierarchical
representations of data through multiple layers of abstraction, often leading to state-of-the-art performance in various
domains.

Learning in AI encompasses a wide range of techniques and methods aimed at enabling machines to improve
their performance on specific tasks through experience or data.

4.1.2 Forms of Learning

Learning in AI refers to the process by which machines acquire knowledge or skills to perform tasks. AI learning
can broadly be categorized into three main types, namely

1) supervised learning
2) unsupervised learning
3) reinforcement learning
Supervised Learning
In supervised learning, the algorithm learns from labeled data, where each input is associated with a
corresponding target output. The goal is to learn a mapping from inputs to outputs. During training, the algorithm adjusts
its parameters to minimize the difference between its predictions and the actual targets.
Examples of supervised learning tasks include classification (where the output is categorical) and regression
(where the output is continuous).
Common algorithms for supervised learning include decision trees, support vector machines (SVM), linear
regression, logistic regression, neural networks, and ensemble methods like random forests and gradient boosting.

Advantages
• Because supervised learning works with a labeled dataset, we know exactly which classes of objects the model
is learning to recognize.
• These algorithms are helpful in predicting the output on the basis of prior experience.

Disadvantages
• These algorithms may struggle with very complex tasks, particularly when labeled data is limited.
• The model may predict the wrong output if the test data differs from the training data.
• Training the algorithm can require a lot of computational time.

Applications
Image classification
In image classification, used to identify objects, faces, and other features in images.
Natural language processing
In NLP, used to extract information from text, such as sentiment, entities, and relationships.
Speech recognition
In speech recognition systems, used to convert spoken language into text.
Recommendation systems
In recommendation systems, used to make personalized recommendations to users.
Predictive analytics
In predictive analytics, used to predict outcomes, sales and stock prices.
Medical diagnosis
In the medical field, used to detect diseases and other medical conditions.
Fraud detection
Used to identify fraudulent transactions.
Autonomous vehicles
During autonomous driving, used to recognize and respond to objects in the environment.
Email spam detection
Used to classify emails as spam or not spam.
Quality control
In manufacturing, used to inspect products for defects.
Gaming
In games, used to recognize characters, analyze player behavior, and create non-player characters (NPCs).
Customer support
Used to automate customer support tasks.
Weather forecasting
Used to make predictions for temperature, precipitation, and other meteorological parameters.
Sports analytics
Analyze player performance, make game predictions, and optimize strategies.

Unsupervised Learning
In unsupervised learning, the algorithm learns from unlabeled data without explicit guidance. The goal is to
discover patterns, structures, or relationships in the data. Unsupervised learning is particularly useful when
there is a large amount of unlabeled data available or when labeled data is scarce.
Examples of unsupervised learning tasks include clustering (grouping similar data points together) and
dimensionality reduction (reducing the number of features while preserving the most important information).
Common algorithms for unsupervised learning include k-means clustering, hierarchical clustering, principal
component analysis (PCA), and autoencoders.
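
As an illustration of clustering, the following is a minimal sketch of k-means on one-dimensional data. The function name `kmeans_1d`, the sample points, and the starting centroids are assumptions chosen for the example; practical implementations handle multiple dimensions and choose starting centroids more carefully.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Cluster 1-D points with plain k-means, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if no point was assigned to it
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between the points.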

Advantages
• Unsupervised learning does not require labeled data, making it applicable to a wider range of datasets and
domains.
• It can uncover hidden structures or patterns in data that may not be apparent to humans.
• Unsupervised learning techniques can often be more scalable than supervised learning, as they do not require
manually labeled data.
Disadvantages
• Without labeled data, unsupervised learning tasks often lack a clear objective function, making it challenging to
evaluate and compare different models.
• Unsupervised learning models may produce complex representations or clusters that are difficult to interpret or
explain.
• Results of unsupervised learning can be subjective and highly dependent on the choice of algorithm and
parameters.
Applications

Clustering
Used to group similar data points into clusters.
Anomaly detection
Used to identify outliers or anomalies in data.
Dimensionality reduction
Reduce the dimensionality of data while preserving its essential information.
Recommendation systems
In recommendation systems, used to suggest products, movies, or content to users based on their historical
behavior or preferences.
Topic modeling
In topic modeling, used to discover hidden topics within a collection of documents.
Density estimation
Used to estimate the probability density function of data.
Image and video compression
Used to reduce the amount of storage required for multimedia content.
Data preprocessing
Used in the data preprocessing tasks such as data cleaning, imputation of missing values, and data scaling.
Market basket analysis
Market basket analysis (MBA) is a data mining technique that helps retailers understand customer purchasing
patterns. Here, unsupervised algorithms are used to discover associations between products.
Genomic data analysis
Genomic data analysis involves the use of various technologies to identify patterns and trends in genetic data.
Here, unsupervised algorithms are used to identify patterns or to group genes with similar expression profiles.
Image segmentation
In image segmentation, used to segment images into meaningful regions.
Community detection in social networks
In social networks, used to identify communities or groups of individuals with similar interests or connections.
Customer behavior analysis
In customer behavior analysis, used to uncover patterns and insights for better marketing and product
recommendations.
Content recommendation
In content recommendation, used to classify and tag the content to make it easier to recommend similar items
to users.
Exploratory data analysis (EDA)
Used to explore data and gain insights before defining specific tasks.

Reinforcement Learning
In reinforcement learning, an agent learns to make decisions by interacting with an environment. The agent
receives feedback in the form of rewards or penalties based on its actions, and the goal is to learn a policy that
maximizes the cumulative reward over time.
Through this trial-and-error process, the agent learns to associate certain actions with positive outcomes and
others with negative outcomes.
Reinforcement learning is commonly used in settings where explicit supervision is impractical or unavailable,
such as game playing, robotics, and autonomous vehicle control.
Common algorithms for reinforcement learning include Q-learning, Deep Q Networks (DQN), policy gradient
methods, and actor-critic methods.
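
To make the reward-driven update concrete, here is a minimal sketch of tabular Q-learning. The four-state corridor environment, the learning rate `ALPHA`, and the discount factor `GAMMA` are all assumptions made for the example. To keep the run reproducible, the loop sweeps over every state-action pair instead of exploring randomly as a real agent would.

```python
# Hypothetical corridor: states 0..3; reaching state 3 (terminal) pays reward 1.
N_STATES = 4
ACTIONS = ["left", "right"]
ALPHA = 0.5   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)


def step(state, action):
    """Deterministic environment: return (next_state, reward)."""
    if action == "left":
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def q_update(q, state, action, reward, next_state):
    """One Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward + GAMMA * max(q[next_state].values())
    q[state][action] += ALPHA * (target - q[state][action])


q = {s: {a: 0.0 for a in ACTIONS} for s in range(N_STATES)}
for _ in range(50):
    # Deterministic sweep over all non-terminal state-action pairs.
    for s in range(N_STATES - 1):
        for a in ACTIONS:
            s_next, r = step(s, a)
            q_update(q, s, a, r, s_next)

# Greedy policy: in each state, pick the action with the highest learned value.
policy = {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}
print(policy)
```

The learned policy moves toward the rewarding state from every position, even though the reward is only ever observed on the final step.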

Advantages
• Reinforcement learning enables agents to learn from interacting with the environment, making it suitable for
tasks where explicit supervision is unavailable.
• Reinforcement learning can handle a wide range of tasks and environments, from simple games to complex
real-world problems.
• Reinforcement learning agents can adapt their behavior over time in response to changes in the environment or
task requirements.

Disadvantages
• Reinforcement learning agents must balance exploration of new actions with exploitation of known good
actions, which can be challenging to optimize.
• Reinforcement learning algorithms often require a large number of interactions with the environment to learn
effective policies, which can be time-consuming and resource-intensive.
• Designing appropriate reward functions that effectively guide the learning process can be difficult, and poorly
designed reward functions may lead to suboptimal behavior or unintended outcomes.
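
One common way to handle the exploration and exploitation trade-off mentioned above is an epsilon-greedy rule, which the notes do not name; it is shown here only as an illustrative sketch. The function name, the `epsilon` parameter, and the sample action values are assumptions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: choose any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])


q_values = [0.2, 0.8, 0.5]   # hypothetical action-value estimates
greedy_choice = epsilon_greedy(q_values, epsilon=0.0)    # never explores
random_choice = epsilon_greedy(q_values, epsilon=1.0, rng=random.Random(0))  # always explores
print(greedy_choice, random_choice)
```

A small epsilon keeps the agent mostly on known good actions while still sampling alternatives occasionally.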

Applications

Game Playing
In gaming, reinforcement learning can teach agents to play even complex games. It can also be used to
create more intelligent and adaptive NPCs in video games.
Robotics
In Robotics, reinforcement learning can teach robots to perform tasks autonomously.
Autonomous Vehicles
In autonomous driving, reinforcement learning can help self-driving cars to navigate and make decisions.
Recommendation Systems
In recommendation systems, reinforcement learning can enhance recommendation algorithms by learning user
preferences.
Healthcare
In Healthcare, reinforcement learning can be used to optimize treatment plans and drug discovery.
Natural Language Processing (NLP)
In NLP, reinforcement learning can be used in dialogue systems and chatbots.
Finance and Trading
In Finance and trading, reinforcement learning can be used for algorithmic trading.
Supply Chain and Inventory Management
In Supply Chain and Inventory Management, reinforcement learning can be used to optimize supply chain
operations.
Energy Management
In Energy Management, reinforcement learning can be used to optimize energy consumption.
Adaptive Personal Assistants
Reinforcement learning can be used to improve personal assistants.
Virtual Reality (VR) and Augmented Reality (AR)
Reinforcement learning can be used to create immersive and interactive experiences in VR and AR.
Industrial Control
Reinforcement learning can be used to optimize industrial processes.
Education
Reinforcement learning can be used to create adaptive learning systems.
Agriculture
Reinforcement learning can be used to optimize agricultural operations.
Each type of learning has its own set of strengths and weaknesses, and the choice of which to use depends on the
specific requirements and constraints of the problem at hand.

4.2 Supervised Learning

In supervised learning, the algorithm learns from labeled data, where each input is associated with a
corresponding target output. The goal is to learn a mapping from inputs to outputs. During training, the algorithm adjusts
its parameters to minimize the difference between its predictions and the actual targets.

Supervised learning is commonly divided into two categories:

1) Classification (assigning inputs to categories)
2) Regression (predicting numerical values)

In classification tasks, the output variable is categorical. The goal is to classify input instances into predefined
classes or categories. In regression tasks, the output variable is continuous. The goal is to predict a numeric value based
on input features.
Suppose we have an input dataset of horse and donkey images. First, we train the machine on these labeled
images so that it learns the distinguishing features, such as the shape and size of the tail, the shape of the eyes,
color, and height (horses are taller, donkeys are smaller).
After training, we give the machine a new picture of a horse or a donkey and ask it to identify the
animal. Because the machine is now trained, it checks the features of the animal, such as
height, shape, color, eyes, ears, and tail. If these features match those of a donkey, it puts the picture in the donkey category.
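
The train-then-predict workflow in this example can be sketched with a very simple classifier. The nearest-centroid method, the two numeric features (shoulder height and ear length in cm), and all sample values below are assumptions for illustration; a real image classifier would learn its features from pixels.

```python
def centroid(rows):
    """Mean feature vector of a list of equal-length feature rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def train(examples):
    """examples: list of (features, label). Returns one centroid per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}


def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((f - x) ** 2 for f, x in zip(features, c))
    return min(model, key=lambda label: dist(model[label]))


# Hypothetical labeled training data: (height in cm, ear length in cm).
training_data = [
    ((160, 15), "horse"), ((170, 16), "horse"),
    ((110, 25), "donkey"), ((120, 27), "donkey"),
]
model = train(training_data)
print(predict(model, (115, 26)))
```

Training uses the labels to summarize each class; prediction then compares a new, unlabeled example against those summaries.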

4.2.1 Classification and Regression

Classification and regression are two fundamental types of supervised learning tasks in artificial intelligence (AI).
They involve predicting output values based on input data, but they differ in the nature of the output variable and the
goal of the prediction.

Classification
Classification is a supervised learning task where the goal is to categorize input data into predefined classes or
categories. The output variable (also called the target variable) is categorical, meaning it consists of distinct classes or
labels.
The goal of classification is to learn a mapping from input features to class labels, allowing the model to classify
new, unseen instances into one of the predefined classes.

Example Applications

• Classify emails as spam or non-spam based on their content and metadata.
• In image classification, identify objects, animals, or scenes in images and assign them to predefined categories.
• In medical diagnosis, classify patients into different disease categories based on symptoms, medical history, and
test results.
• In sentiment analysis, determine the sentiment (positive, negative, neutral) of text data such as product reviews
or social media posts.
Regression
Regression is a supervised learning task where the goal is to predict continuous numerical values based on input
data. The output variable (also called the target variable) is continuous, meaning it can take on any value within a range.

The goal of regression is to learn a mapping from input features to numerical values, allowing the model to
predict a continuous outcome for new, unseen instances.

Example Applications

• Predict the selling price of houses based on features such as location, size, and number of bedrooms
• Forecast future stock prices based on historical stock data, market trends, and external factors
• Predict future demand for products or services based on historical sales data, marketing efforts, and economic
indicators
• Forecast future temperatures based on historical weather data, geographic location, and time of year

Difference between Regression and Classification

Regression Algorithm | Classification Algorithm
The output variable must be continuous (a real value). | The output variable must be a discrete value.
The task is to map the input value (x) to a continuous output variable (y). | The task is to map the input value (x) to a discrete output variable (y).
Used with continuous data. | Used with discrete data.
We try to find the best-fit line, which can predict the output more accurately. | We try to find the decision boundary, which divides the dataset into different classes.
Used to solve regression problems such as weather prediction and house price prediction. | Used to solve classification problems such as identification of spam emails, speech recognition, and identification of cancer cells.
Can be further divided into linear and non-linear regression. | Can be divided into binary classifiers and multi-class classifiers.

4.2.2 Machine Learning

Definition
A computer program is said to learn from experience E with respect to some class of tasks T and performance
measure P, if its performance at tasks in T, as measured by P, improves with experience E.

E, T and P for different learning problems

Checkers learning problem


Task T: playing checkers
Performance measure P: percent of games won against opponents
Training experience E: playing practice games against itself

A handwriting recognition learning problem


Task T: recognizing and classifying handwritten words within images
Performance measure P: percent of words correctly classified
Training experience E: a database of handwritten words with given classifications

A robot driving learning problem


Task T: driving on public four-lane highways using vision sensors
Performance measure P: average distance traveled before an error (as judged by human overseer)
Training experience E: a sequence of images and steering commands recorded while observing a human driver

Examples of Machine Learning

Learning to recognize spoken words


All of the most successful speech recognition systems employ machine learning in some form.
For example, the SPHINX system learns speaker-specific strategies for
recognizing the primitive sounds (phonemes) and words from the observed speech signal.
Neural network learning methods and methods for learning hidden Markov models
are effective for automatically customizing to individual speakers, vocabularies, microphone characteristics, background
noise, etc. Similar techniques have potential applications
in many signal-interpretation problems.

Learning to drive an autonomous vehicle


Machine learning methods have been used to train computer-controlled vehicles to steer correctly when driving
on a variety of road types. For example, the ALVINN (Autonomous Land Vehicle In a Neural Network) system has used its
learned strategies to drive unassisted at 70 miles per hour for 90 miles on public highways among other cars. Similar
techniques have possible applications in many sensor-based control problems.

Learning to classify new astronomical structures


Machine learning methods have been applied to a variety of large databases to learn general
regularities implicit in the data. For example, decision tree learning algorithms have been used
by NASA to learn how to classify celestial objects from the second Palomar Observatory Sky
Survey. This system is now used to automatically classify all objects in the Sky Survey, which consists of three terabytes of
image data.

Learning to play world-class backgammon


The most successful computer programs for playing games such as backgammon are based on machine learning
algorithms. For example, the world's top computer program for backgammon, TD-Gammon,
learned its strategy by playing over one million practice games against itself. It now plays at a level competitive with the
human world champion.

4.2.3 Decision Trees

A decision tree is a tree-like structure that represents a series of decisions and their possible consequences. It is
used in machine learning for classification and regression tasks. An example of a decision tree is a flowchart that helps a
person decide what to wear based on the weather conditions.
The process of forming a decision tree involves recursively partitioning the data based on the values of different
attributes. Commonly these algorithms select the best attribute to split the data at each internal node, based on certain
criteria such as information gain or Gini impurity. This splitting process continues until a stopping criterion is met, such as
reaching a maximum depth or having a minimum number of instances in a leaf node.
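
The two splitting criteria named above can be computed as in the following sketch. The function names and the toy labels are assumptions; the formulas are the standard definitions of Gini impurity, Shannon entropy, and information gain.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child subsets."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with an even class mix, and a split that separates it perfectly.
parent = ["yes", "yes", "yes", "no", "no", "no"]
split = [["yes", "yes", "yes"], ["no", "no", "no"]]
print(gini(parent), information_gain(parent, split))
```

A tree-building algorithm would evaluate such a score for every candidate attribute at a node and split on the one with the highest information gain or the lowest weighted impurity.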
Decision tree induction is one of the simplest, most successful forms of learning
algorithm and it is easy to implement.
A decision tree takes as input an object or situation described by a set of attributes and
returns a decision that represents the predicted output value for the input. The input attributes can be discrete or
continuous. The output value can also be discrete or continuous. Learning a discrete-valued function is called
classification. Learning a continuous function is called regression. In Boolean classification, each example is classified as
true (positive) or false (negative).
A decision tree reaches its decision by performing a sequence of tests. Each internal node in the tree
corresponds to a test of the value of one of the properties, and the branches from the node are labeled with the possible
values of the test. Each leaf node in the tree specifies the value to be returned if that leaf is reached. The decision tree
representation seems very natural for humans; indeed, many how-to manuals (e.g., for car repair) are written
entirely as a single decision tree stretching over hundreds of pages.

Types of decision trees


There are two types of decision trees, namely classification trees and regression trees.
Classification trees
Classification trees predict a discrete category, most simply the answer to a "yes" or "no" question.
For example,
Did the customer have a good experience?
Was the order shipment complete?
Regression trees
Regression trees are designed to predict continuous values and are built from historical data. For example,
How many computers will we sell this quarter?
Which store location will have the most traffic on the next holiday?

Parts of a decision tree structure

Root node
The root node is at the highest level of a decision tree and is the starting point from which the rest of the tree
grows. This node represents the question, task or problem. Any node that has child nodes (sub-nodes), including
the root, is also called a parent node.
Internal node or child node
An internal node or child node is a node that has a parent above or preceding it. It may in turn be the parent of another
child node. An internal node tests an attribute and branches out into two or more sub-nodes.

Decision tree leaf node


Also known as external nodes or terminal nodes, these are the endpoints of the tree and have no child nodes. They are
found furthest from the root node and are where the answer or solution is given.

Pruning
Pruning is the process of trimming a tree by removing nodes or sub-trees that contribute little, leaving only the most
critical nodes and their potential outcomes.

Splitting
Splitting is the process of dividing a node into two or more sub-nodes. It is the opposite of pruning.

Decision tree sub-tree or branch


This is a specific section of a decision tree. It contains multiple internal nodes and potentially some leaf nodes depending
on the specific branch in question.
Branch
A branch is a line that connects nodes at a split point.

Problem Solving using Decision Tree

Problem: Waiting for a table at a restaurant


Problem Statement:
Decide whether or not to wait for a table at a restaurant.
Solution:
The aim here is to learn a definition for the goal predicate, that is, to predict whether we will wait for
a table or not.
In setting this up as a learning problem, first we have to state what attributes are available to
describe the examples in the domain.
Let's suppose we decide on the following list of attributes:
Alternate: whether there is a suitable alternative restaurant nearby
Park: whether the restaurant has a comfortable park area to wait in
Fri/Sat: true on Fridays and Saturdays
Hungry: whether we are hungry
Patrons: how many people are in the restaurant (values are None, Some, and Full)
Price: the restaurant's price range (500, 1000, 3000)
Raining: whether it is raining outside
Reservation: whether we made a reservation
Type: the kind of restaurant (French, Italian, Thai, or burger)
WaitEstimate: the wait estimated by the host (0-10 minutes, 10-30, 30-60, >60)

Decision Tree:
Notice that the tree does not use the Price and Type attributes, in effect considering them
to be irrelevant. Examples are processed by the tree starting at the root and following the
appropriate branch until a leaf is reached. For instance, an example with Patrons = Full and
WaitEstimate = 0-10 will be classified as positive (i.e., yes, we will wait for a table).
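
The way an example is processed from root to leaf can be sketched as follows. The tree below is a simplified stand-in, not the full tree for this problem: only the path Patrons = Full, WaitEstimate = 0-10 leading to "Yes" is taken from the text, and every other branch is a hypothetical placeholder. The nested-tuple representation and the name `classify` are likewise assumptions.

```python
# Internal nodes are (attribute, {value: subtree}); leaves are plain strings.
# Only the path Patrons = Full, WaitEstimate = 0-10 -> "Yes" comes from the text;
# all other branches are hypothetical placeholders.
TREE = ("Patrons", {
    "None": "No",
    "Some": "Yes",
    "Full": ("WaitEstimate", {
        "0-10": "Yes",
        "10-30": "Yes",
        "30-60": "No",
        ">60": "No",
    }),
})


def classify(tree, example):
    """Follow branches from the root until a leaf (a string) is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        # Test the attribute at this node and follow the matching branch.
        tree = branches[example[attribute]]
    return tree


example = {"Patrons": "Full", "WaitEstimate": "0-10"}
print(classify(TREE, example))
```

Attributes that never appear as a test in the tree, such as Price and Type, are simply never looked up.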

Advantages of Decision Tree


• Easy to understand and interpret, making them accessible to non-experts
• Handle both numerical and categorical data without requiring extensive preprocessing
• Provide insights into feature importance for decision-making
• Handle missing values and outliers without significant impact
• Applicable to both classification and regression tasks

Disadvantages of Decision Tree


• Prone to overfitting
• Sensitive to small changes in the data
• Limited generalization if the training data is not representative
• Potential bias in the presence of imbalanced data

Applications of Decision tree


• Decision trees can analyze email content and metadata to classify emails as spam or non-spam
• Medical decision trees can assist doctors in diagnosing diseases based on patient symptoms and medical history
• Decision trees can predict whether customers are likely to churn (i.e., stop using a service or product) based on
their behavior and characteristics
• They can predict the selling price of houses based on features such as location, size, and amenities
• They can analyze historical stock data and external factors to predict future stock prices
• They can predict future demand for products or services based on historical sales data and market trends
• They can detect fraudulent transactions based on patterns of behavior or transactions that deviate from normal
activity
• They can identify abnormal network traffic patterns that may indicate a cyberattack or security breach
• They can predict equipment failures or malfunctions based on sensor data and maintenance records
• They can identify key demographic or behavioral factors that differentiate customer segments for targeted
marketing campaigns
• They can select relevant features from images to improve the accuracy of image recognition and classification
models
• They can assign multiple labels or categories to input data, such as classifying documents into multiple topics or
categories
• They can learn multiple related tasks simultaneously, such as predicting both age and gender from demographic
data

4.2.4 Regression and Classification with Linear Models

Linear Models
A linear model is a type of algorithm in machine learning that creates a formula to predict unknown values.
Linear models are considered as the simplest class of algorithms, and are the building block for many complex machine
learning algorithms.
The term linear model implies that the model is specified as a linear combination of features. Based on training
data, the learning process computes one weight for each feature to form a model that can predict or estimate the target
value.
Linear models are commonly used for classification and regression tasks. There are two types of linear models:

• Linear regression
Used for regression (numerical predictions)
• Logistic regression
Used for classification (categorical predictions)

Regression and classification are two fundamental tasks in supervised learning. Linear models are frequently
used for both tasks due to their simplicity, interpretability, and computational efficiency.
Regression aims to predict a continuous target variable, while classification aims to assign data points to
discrete classes.

Regression with Linear model


Linear regression models the relationship between the independent variables (features) and the dependent
variable (target) using a linear equation
$$y = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n + \epsilon$$
where,
$y$ is the predicted value
$x_1, x_2, \dots, x_n$ are the input features
$w_0, w_1, \dots, w_n$ are the model parameters (coefficients)
$\epsilon$ represents the error term
The goal is to learn the optimal values of $w_0, w_1, \dots, w_n$
that minimize the difference between the predicted and actual target values.
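
For the special case of a single input feature, the weights that minimize the squared difference have a simple closed form. The sketch below assumes ordinary least squares as the notion of "difference" (the notes do not specify one), and the function name and sample data are made up for the example.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = w0 + w1 * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w1 = sxy / sxx
    # Intercept: makes the line pass through the point of means.
    w0 = mean_y - w1 * mean_x
    return w0, w1


# Hypothetical data generated from y = 1 + 2x with no noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
w0, w1 = fit_line(xs, ys)
print(w0, w1)
print(w0 + w1 * 5.0)   # prediction for a new input x = 5
```

With several features, the same idea is usually solved with matrix methods or gradient descent rather than this one-variable formula.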

Classification with Linear model

In classification tasks, the goal is to assign input data points to one of several predefined classes or categories.
Linear models such as logistic regression and linear support vector machines (SVMs) are commonly used for binary and
multiclass classification tasks.
Logistic regression is a linear classification algorithm that models the probability of a binary outcome (e.g., class
0 or class 1) using the logistic function (sigmoid function)

$$P(y=1 \mid x) = \frac{1}{1 + e^{-(w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n)}}$$
where,
$P(y=1 \mid x)$ is the probability of the positive class given input $x$
$x_1, x_2, \dots, x_n$ are the input features
$w_0, w_1, \dots, w_n$ are the model parameters (coefficients)

The decision boundary is linear in logistic regression, separating the feature space into regions corresponding to different
classes.
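
The logistic function and the resulting linear decision boundary can be sketched as follows. The weight values and the 0.5 threshold are assumptions for illustration; in practice the weights are learned from training data.

```python
from math import exp


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def predict_proba(weights, features):
    """weights = [w0, w1, ..., wn]; features = [x1, ..., xn]. Returns P(y=1 | x)."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def predict_class(weights, features, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return 1 if predict_proba(weights, features) >= threshold else 0


# Hypothetical weights: the decision boundary is the straight line x1 + x2 = 4.
weights = [-4.0, 1.0, 1.0]
print(predict_proba(weights, [2.0, 2.0]))   # a point on the boundary
print(predict_class(weights, [3.0, 3.0]))   # a point on the positive side
print(predict_class(weights, [1.0, 1.0]))   # a point on the negative side
```

Points on the boundary get probability exactly 0.5, because the weighted sum inside the exponent is zero there.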

Artificial Neural Networks


A neural network is a machine learning program, or model, that makes decisions in a manner similar to the
human brain, by using processes that mimic the way biological neurons work together to identify phenomena, weigh
options and arrive at conclusions.
Neural networks are sometimes called artificial neural networks (ANNs) or simulated neural networks (SNNs).
They are a subset of machine learning, and at the heart of deep learning models.

An ANN is a computational model based on the human brain's neural structure. It is made up of interconnected
nodes (neurons) that are organized into layers. ANNs provide a general, practical method for learning real-valued,
discrete-valued, and vector-valued functions from examples. ANN learning is robust to errors in the training data and has
been successfully applied to problems such as interpreting visual scenes, speech recognition, and learning robot control
strategies.

Artificial neurons vs Biological neurons


The concept of artificial neural networks comes from the biological neurons found in animal brains, so the two
share many similarities in structure and function.
Structure
The structure of artificial neural networks is inspired by biological neurons. A biological neuron has a cell body or
soma to process the impulses, dendrites to receive them, and an axon that transfers them to other neurons. The input
nodes of artificial neural networks receive input signals, the hidden layer nodes compute these input signals, and the
output layer nodes compute the final output by processing the hidden layer’s results using activation functions.

Biological Neuron | Artificial Neuron
Dendrite | Inputs
Cell nucleus or Soma | Nodes
Synapses | Weights
Axon | Output

Synapses
Synapses are the links between biological neurons that enable the transmission of impulses from one neuron to
the next. In artificial neural networks, weights play the role of synapses, joining the nodes of one layer to the
nodes of the next layer. The strength of each link is determined by its weight value.

Learning
In biological neurons, impulses are processed in the cell body, or soma, which contains the nucleus. If the
impulses are strong enough to reach the threshold, an action potential is produced and travels along the axon. Learning
is made possible by synaptic plasticity, the ability of synapses to become stronger or weaker over time in response to
changes in their activity. In artificial neural networks, backpropagation is the technique commonly used for learning: it
adjusts the weights between nodes according to the error, that is, the difference between predicted and actual
outcomes.
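
The weight adjustment described above can be sketched for the simplest possible case: a single sigmoid neuron trained by gradient descent on one example. This is only a minimal illustration, not full backpropagation through several layers. The starting weights, the training example, the learning rate `lr`, and the squared-error loss are all assumptions chosen for the sketch.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(weights, bias, inputs, target, lr=0.5):
    """One gradient-descent update for a single sigmoid neuron (squared error)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    predicted = sigmoid(z)
    error = predicted - target
    # Chain rule: dE/dz = (predicted - target) * sigmoid'(z)
    delta = error * predicted * (1.0 - predicted)
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    # The returned loss is the one measured before this update.
    return new_weights, new_bias, 0.5 * error ** 2

# Hypothetical starting values and training example.
weights, bias = [0.1, -0.2], 0.0
inputs, target = [1.0, 2.0], 1.0

losses = []
for _ in range(50):
    weights, bias, loss = train_step(weights, bias, inputs, target)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Each step moves the weights in the direction that reduces the difference between the predicted and the actual outcome, so the loss shrinks over the iterations.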

Activation
In biological neurons, activation refers to the firing of the neuron, which happens when the impulses are strong
enough to reach the threshold. In artificial neural networks, a mathematical function known as an activation function
maps the weighted input of a node to its output.

Basic Components of Artificial Neural Networks

Neurons (Nodes)
Neurons are the basic computational units of ANNs. Each neuron receives input signals, performs a
computation, and produces an output signal.
Weights and Biases
Connections between neurons are represented by weights, which determine the strength of the connection.
Biases are additional parameters that allow neurons to learn different representations of the input data.
Activation Function
Each neuron applies an activation function to the weighted sum of its inputs to produce the output signal.
Common activation functions are sigmoid, tanh, ReLU (Rectified Linear Unit), and softmax.
Layers
Neurons in an ANN are organized into layers. The input layer receives the input data, while the output layer
produces the final output. Hidden layers, situated between the input and output layers, perform intermediate
computations.
Connections
Neurons in adjacent layers are connected through connections, which transmit signals from one layer to
another. Each connection has an associated weight that determines its strength.
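
The components listed above can be put together in a short sketch of a single neuron: a weighted sum of the inputs plus a bias, passed through an activation function. The input, weight, and bias values are hypothetical, and the helper names (`neuron_output`, `softmax`, and so on) are chosen only for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def softmax(scores):
    # Subtract the maximum first for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def neuron_output(inputs, weights, bias, activation):
    # Weighted sum of the inputs plus the bias, passed through the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical values, chosen only for illustration.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

for name, fn in [("sigmoid", sigmoid), ("tanh", math.tanh), ("relu", relu)]:
    print(name, round(neuron_output(inputs, weights, bias, fn), 4))

# Softmax acts on a whole layer of scores and returns values that sum to 1.
print([round(p, 4) for p in softmax([2.0, 1.0, 0.1])])
```

With these values the weighted sum is negative (about -0.4), so ReLU outputs 0 while sigmoid and tanh return small non-zero values.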

Artificial Neural Networks contain artificial neurons, which are called units. These units are arranged in a series
of layers that together constitute the whole Artificial Neural Network. A layer can have only a dozen units or
millions of units, depending on how complex the network needs to be to learn the hidden patterns in the
dataset. Commonly, an Artificial Neural Network has an input layer, an output layer, and hidden layers. The input layer
receives data from the outside world that the neural network needs to analyze or learn about. This data then passes
through one or more hidden layers that transform the input into data that is valuable for the output layer. Finally, the
output layer provides the response of the Artificial Neural Network to the input data.
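
The flow from input layer through hidden layers to output layer can be sketched as a forward pass. The network shape (2 inputs, 3 hidden units, 1 output unit), all weights and biases, and the use of the sigmoid activation in every layer are assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weight_rows, biases):
    # Each row of weight_rows holds the incoming weights of one unit.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    # Pass the data through the layers one after another.
    activations = inputs
    for weight_rows, biases in layers:
        activations = dense_layer(activations, weight_rows, biases)
    return activations

# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output unit.
hidden = ([[0.5, -0.6], [0.1, 0.8], [-0.3, 0.2]], [0.0, 0.1, -0.1])
output = ([[0.7, -0.4, 0.9]], [0.05])

result = forward([1.0, 0.5], [hidden, output])
print(result)
```

The input layer here is simply the list of input values; only the hidden and output layers carry weights.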

Applications of Artificial Neural Networks

Social Media
Artificial Neural Networks are used heavily in social media. For example, the ‘People you may know’ feature on
Facebook suggests people that we might know in real life so that we can send them friend requests. This is achieved
by using Artificial Neural Networks that analyze a person's profile, their interests, their current friends, the friends of
those friends, and various other factors to work out which people they might potentially know.

Another common application of Machine Learning in social media is facial recognition. This is done by finding
around 100 reference points on the person’s face and then matching them with those already available in the database
using convolutional neural networks.

Marketing and Sales


When a user logs onto e-commerce sites like Amazon and Flipkart, the site recommends products to buy based
on the user's previous browsing history. Similarly, Zomato, Swiggy, etc. show restaurant recommendations based on
the user's tastes and previous order history. This is true across all new-age marketing segments like book sites, movie
services, hospitality sites, etc., and it is done by implementing personalized marketing. This uses Artificial Neural
Networks to identify the customer's likes, dislikes, previous shopping history, etc., and then tailor the marketing
campaigns accordingly.

Healthcare
Artificial Neural Networks are used in oncology to train algorithms that can identify cancerous tissue at the
microscopic level with accuracy comparable to that of trained physicians. Various rare diseases may manifest in physical
characteristics and can be identified in their early stages by using facial analysis on patient photos. A full-scale
implementation of Artificial Neural Networks in the healthcare environment could therefore enhance the diagnostic
abilities of medical experts and ultimately improve the quality of medical care all over the world.

Personal Assistants

Personal assistants such as Siri, Alexa, Cortana, etc. use Natural Language Processing to interact with
users and formulate a response accordingly. Natural Language Processing uses artificial neural networks that are
built to handle many tasks of these personal assistants, such as managing language syntax, semantics, correct
speech, the ongoing conversation, etc.

Support Vector Machines

SVM stands for support vector machine. It is a supervised machine learning algorithm that classifies data by
finding an optimal line or hyperplane. SVMs were developed in the 1990s. SVMs are particularly effective in high-
dimensional spaces, making them suitable for tasks like image classification, text categorization, and bioinformatics.
SVM needs labeled training data, where each data point is associated with a class label. We can choose a kernel
function (linear, polynomial, radial basis function, etc.) based on the characteristics of data.
Then, we should train the SVM model on the training data, which involves finding the optimal hyperplane or
decision boundary that separates the classes with the maximum margin.
Once trained, the SVM model can predict the class labels of new, unseen data points.

SVMs are commonly used for classification problems. They distinguish between two classes by finding the
optimal hyperplane that maximizes the margin between the closest data points of opposite classes. The number of
features in the input data determines whether the hyperplane is a line in a 2-D space or a hyperplane in an n-dimensional
space. Since multiple hyperplanes can be found to differentiate classes, maximizing the margin between points enables
the algorithm to find the best decision boundary between classes. This, in turn, helps it generalize to new data and
make accurate classification predictions. The data points closest to the optimal hyperplane are known as support vectors;
the margin boundaries on either side of the hyperplane run through these points, which determine the maximal margin.

Linear SVM
Linear SVM uses a linear kernel function to find a linear decision boundary that separates classes in the input
space. It works well when the data is linearly separable, and the decision boundary is a straight line.
For example, suppose we have a dataset of points in a 2-dimensional space, where each point belongs to one of
two classes, namely circle and diamond. Our task is to build a classifier using a linear SVM to separate the circle
points from the diamond points based on their coordinates. The goal is to train a linear SVM to find a decision boundary
(a line in this 2D space) that effectively separates the circles from the diamonds.

Example Data

X1 X2 Class
2 3 circle
4 5 circle
1 1 diamond
5 3 diamond
6 5 circle
2 2 diamond

Let's visualize the data points on a 2D plane

Each data point is represented by its features (X1, X2). In this case, each point lies in a 2-dimensional space.
The SVM algorithm aims to find a linear decision boundary (hyperplane) that separates the circles from the diamonds.
This decision boundary is defined by the equation of a line in 2D space.

w1*X1 + w2*X2 + b = 0

The SVM algorithm tries to maximize the margin, which is the distance between the decision boundary and the nearest
data points (support vectors) from each class. The optimization problem involves finding the optimal weights (w1, w2)
and bias (b) parameters that define the decision boundary, while minimizing the classification error. This optimization
problem is typically solved using techniques from convex optimization, such as quadratic programming.
The SVM algorithm iteratively adjusts the parameters (weights and bias) to find the hyperplane that best
separates the classes. It assigns class labels (+1 for circle, -1 for diamond) to the data points and finds the optimal
hyperplane that maximizes the margin between the two classes.
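
One common way to write this optimization problem is the soft-margin form below. It is given as a sketch in standard textbook notation rather than as the formulation used by any particular solver. Here y_i is the class label (+1 or -1) of the i-th point (X_{1,i}, X_{2,i}) and n is the number of training points. The slack variables \xi_i and the penalty constant C do not appear in the text above and are assumptions of this sketch: \xi_i measures how far a point violates the margin, and C weighs classification error against margin width.

```latex
\begin{aligned}
\min_{w_1,\, w_2,\, b,\, \xi}\quad
  & \tfrac{1}{2}\left(w_1^2 + w_2^2\right) + C \sum_{i=1}^{n} \xi_i \\
\text{subject to}\quad
  & y_i \left(w_1 X_{1,i} + w_2 X_{2,i} + b\right) \ge 1 - \xi_i, \\
  & \xi_i \ge 0, \qquad i = 1, \dots, n
\end{aligned}
```

The width of the margin is 2 / \sqrt{w_1^2 + w_2^2}, so minimizing w_1^2 + w_2^2 is equivalent to maximizing the margin. When the data is linearly separable and every \xi_i is zero, this reduces to the hard-margin problem.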
After training, the linear SVM finds the optimal decision boundary (hyperplane) that separates the circles from
the diamonds in the 2D space. New data points can be classified by determining which side of the decision boundary they
fall on.

Now, the linear SVM has learned a linear decision boundary (a line) that separates the two classes in the feature
space. This decision boundary can accurately classify new data points into their respective classes based on their
coordinates.
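
The idea of choosing the separating line with the widest margin can be checked on the six example points. The sketch below does not train an SVM. It compares two hand-picked candidate lines, both of which are assumptions rather than solver output, and classifies two hypothetical new points by the side of the line they fall on. `line_b` was worked out by hand for this table so that the points (2, 3), (2, 2), and (5, 3) lie exactly on the margin.

```python
import math

# Data from the table above: label +1 for circle, -1 for diamond.
points = [((2, 3), 1), ((4, 5), 1), ((1, 1), -1),
          ((5, 3), -1), ((6, 5), 1), ((2, 2), -1)]

def decision_value(w, b, x):
    # Left-hand side of the line equation w1*X1 + w2*X2 + b.
    return w[0] * x[0] + w[1] * x[1] + b

def classify(w, b, x):
    return "circle" if decision_value(w, b, x) > 0 else "diamond"

def margin(w, b):
    # Smallest signed distance of a training point to the line;
    # a negative value means some point is on the wrong side.
    norm = math.hypot(w[0], w[1])
    return min(y * decision_value(w, b, x) / norm for x, y in points)

# Two hand-picked candidate lines (assumptions, not solver output).
line_a = ((-0.5, 1.0), -1.2)
line_b = ((-2 / 3, 2.0), -11 / 3)

for name, (w, b) in [("line_a", line_a), ("line_b", line_b)]:
    print(name, "margin =", round(margin(w, b), 3))

# Classify two new, hypothetical points with the wider-margin line.
w, b = line_b
print(classify(w, b, (3, 4)), classify(w, b, (4, 2)))
```

Both lines separate the two classes, but `line_b` keeps the nearest points farther away, which is the property the SVM optimizes for.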

Non-linear SVM
Non-linear SVMs use kernel functions, such as the polynomial kernel, the Gaussian (RBF) kernel, or the sigmoid
kernel, to map the input data into a higher-dimensional space where it may be linearly separable. This allows an SVM to
learn non-linear decision boundaries and to classify data that is not linearly separable in the original feature space.
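
The effect of moving to a higher-dimensional space can be sketched with an explicit feature map. In the hypothetical dataset below, one class sits near the origin and the other surrounds it, so no straight line separates them in 2D. Adding a third feature, X1^2 + X2^2, makes a flat plane sufficient. The data, the `lift` function, and the threshold are all assumptions made for this illustration; a kernel SVM performs such a mapping implicitly rather than computing it.

```python
# Hypothetical ring-shaped data: class A near the origin, class B around it.
inner = [(0.5, 0.0), (-0.5, 0.5), (0.0, -1.0), (0.7, 0.7)]
outer = [(3.0, 0.0), (-2.5, 2.0), (0.0, -3.0), (2.0, -2.5), (-3.0, -0.5)]

def lift(point):
    # Map (x1, x2) to (x1, x2, x1^2 + x2^2).
    x1, x2 = point
    return (x1, x2, x1 * x1 + x2 * x2)

# In the lifted space the plane "third coordinate = 4" separates the classes.
threshold = 4.0
inner_ok = all(lift(p)[2] < threshold for p in inner)
outer_ok = all(lift(p)[2] > threshold for p in outer)
print(inner_ok, outer_ok)
```

Viewed back in the original two dimensions, that plane corresponds to a circle of radius 2, which is a non-linear boundary.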
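Let's consider an example where a non-linear SVM is used to classify data that is not linearly separable.
We have a dataset of points in a 2-dimensional space, where each point belongs to one of two classes, namely
circle or diamond. The setting of interest is one in which the classes are not linearly separable in the original feature
space. Note that the six sample points listed below can in fact be separated by a straight line (for example,
X1 + X2 = 5.5), so they only illustrate the layout of the data.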
Example Data

X1 X2 Class
1 2 circle
2 1 circle
1.5 1.5 circle
4 4 diamond
5 5 diamond
4.5 5 diamond

Visualization
To address this issue, we can use a non-linear SVM with a kernel function. Let's choose the Gaussian Radial Basis
Function (RBF) kernel for this example. The RBF kernel implicitly maps the 2-dimensional input space into a
higher-dimensional space where the classes might become linearly separable. The SVM algorithm seeks to maximize the
margin between the classes in the transformed space, similar to the linear SVM. The optimization problem involves
finding the parameters that define the decision boundary while minimizing the classification error. Kernel parameters,
such as the width of the RBF kernel, are usually set beforehand or tuned on validation data.
After training, the non-linear SVM finds a decision boundary in the transformed feature space that separates
the circles from the diamonds. In the transformed space this boundary is a hyperplane, which is difficult to visualize.
Mapped back to the original 2D space, it appears as a non-linear curve that separates the two classes.
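
The RBF kernel itself is a simple similarity measure, K(x, z) = exp(-gamma * ||x - z||^2). The sketch below evaluates it on points from the example table. The value `gamma=0.5` is an assumption chosen for illustration, and the function name `rbf_kernel` is local to this sketch.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    # K(x, z) = exp(-gamma * ||x - z||^2); gamma is an assumed kernel width.
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

circles = [(1, 2), (2, 1), (1.5, 1.5)]
diamonds = [(4, 4), (5, 5), (4.5, 5)]

same_class = rbf_kernel(circles[0], circles[1])    # two circles
cross_class = rbf_kernel(circles[0], diamonds[0])  # a circle and a diamond
print(round(same_class, 4), round(cross_class, 4))
```

Nearby points get a kernel value close to 1 and distant points a value close to 0. The SVM works with these pairwise values instead of explicit coordinates in the higher-dimensional space.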

Now consider the case where one diamond lies within the region occupied by the circles. This diamond is an outlier.
With a soft margin, the SVM algorithm can ignore such an outlier and still find the hyperplane that maximizes the margin
for the remaining points. In this sense, SVM is fairly robust to outliers.

Advantages and disadvantages of SVM

Advantages of support vector machine


• SVMs are effective in handling high-dimensional data, as in image and text classification.
• SVMs can also perform well with small datasets.
• They can model non-linear decision boundaries.
• They are able to classify new, unseen data well.
• They can be used for both classification and regression tasks.
• Because the decision boundary depends only on the support vectors and the margin is maximized, they are
relatively less prone to overfitting.

Disadvantages of support vector machine

• SVMs can be computationally expensive for large datasets, as the algorithm requires solving a quadratic
optimization problem.
• The choice of kernel can greatly affect the performance of an SVM, and it can be difficult to determine the best
kernel for a given dataset.
• Performance can degrade when the number of features for each data point exceeds the number of training samples.
• SVMs can be memory-intensive during training.
• Not well suited to large datasets with many features (training can be very slow and can consume a lot of memory).
• Not suitable for datasets with missing values unless these are handled beforehand.

Applications of support vector machine


SVMs are popular in various applications such as image classification, natural language processing,
bioinformatics, and more.
Handwriting recognition
SVMs are used to recognize handwritten characters.
Facial Expression Classification
The facial expression classification model determines the precise facial expression by modeling differences
between two facial images. Validation techniques include the leave-one-out method and the K-fold test method.
Speech Recognition
The transcription of speech into text is called speech recognition. Mel Frequency Cepstral Coefficient (MFCC)
based features are used to train Support Vector Machines (SVMs), which are then used to recognize speech. Speech
recognition is a challenging classification problem that is tackled using a variety of mathematical techniques,
including support vector machines, pattern recognition techniques, etc.

Text classification
SVMs are commonly used in natural language processing (NLP) for tasks such as sentiment analysis, spam
detection, and topic modeling. They lend themselves to these data as they perform well with high-dimensional data.

Image classification
SVMs are applied in image classification tasks such as object detection and image retrieval. They can also be useful
in security domains, for example to classify an image as one that has been tampered with.

Bioinformatics
SVMs are also used for protein classification, gene expression analysis, and disease diagnosis. SVMs are often
applied in cancer research because they can detect subtle trends in complex datasets.
Geographic information system (GIS)
SVMs can analyze layered geophysical structures underground, filtering out the 'noise' from electromagnetic
data. They have also helped to predict the seismic liquefaction potential of soil, which is relevant to the field of civil
engineering.
