ML 1


ECE 401D -

Machine
learning
By Dr. Yang Saring
NIT Arunachal Pradesh
What is machine learning ?
 As per the 1959 definition of Arthur Samuel, machine learning is the process of feeding data to a
computer system so that it learns to perform a task in the future without being explicitly
programmed for it.
 Machine learning deals with creating computer programs that automatically improve with experience.
 The machine makes decisions based on experience, mimicking the process of human
decision-making.
 Learning is the process of converting experience into expertise or knowledge.
 The input to a learning algorithm is training data, representing experience, and
the output is some expertise, which usually takes the form of another computer
program that can perform some task.
While human learners can rely on common sense to filter out random
meaningless learning conclusions, once we export the task of learning to a
machine, we must provide well defined crisp principles that will protect the
program from reaching senseless or useless conclusions. The development of
such principles is a central goal of the theory of machine learning.
Traditional Programming Vs
Machine Learning
 In ML, data is fed to the machine instead of commands: an algorithm is selected,
hyperparameters (settings) are configured and adjusted, and the machine is instructed to
conduct its analysis.
 The machine proceeds to decipher patterns found in the data through the process of trial
and error.
 The machine’s data model, formed from analyzing data patterns, can then be used to
predict future values.
 Machine learning entails a three-step process: Data > Model > Action

[Figure: block diagram of the working of a machine learning algorithm]

 Data is fed to generic algorithms.
 With the help of these algorithms, the machine builds the logic from the data and predicts
the output.
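The contrast between a hand-written rule and a rule built from data can be sketched in a few lines. This is only a toy illustration of Data > Model > Action: the spam scenario, the numbers, and all function names below are made up for the example.

```python
# Toy illustration of Data > Model > Action (all names and numbers are hypothetical).

def rule_based_is_spam(num_links):
    """Traditional programming: a human hard-codes the rule."""
    return num_links > 5  # threshold chosen by the programmer


def learn_threshold(examples):
    """Machine learning: pick the threshold that makes the fewest mistakes on the data."""
    candidates = sorted({x for x, _ in examples})
    best_t, best_errors = candidates[0], len(examples) + 1
    for t in candidates:
        # Count examples where the rule "x > t" disagrees with the known label.
        errors = sum((x > t) != label for x, label in examples)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


# Data: (number of links in an e-mail, is it spam?)
data = [(0, False), (1, False), (2, False), (3, True), (4, True), (7, True)]

# Model: a threshold learned from the data instead of typed in by hand.
threshold = learn_threshold(data)


# Action: use the learned model on a new e-mail.
def predict(num_links):
    return num_links > threshold
```

With this data the learned threshold separates the examples perfectly, whereas the hand-coded rule would miss the e-mails with 3 and 4 links.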
When do we need ML?
 Problem’s complexity
 Tasks that are too complex to be expressed as a well-defined software program, such as driving,
speech processing, image understanding, etc.
 Tasks related to the analysis of very large and complex data sets, such as astronomical data, turning
medical archives into medical knowledge, weather prediction, web search engines, e-commerce, etc.

 Need for Adaptivity


 Machine learning tools – programs whose behavior adapts to their input data – offer adaptivity.
Typical examples include programs that decode handwritten text, where a fixed program can adapt
to variations between the handwriting of different users; spam detection programs, adapting
automatically to changes in the nature of spam e-mails; and speech recognition programs.
Applications of Machine
Learning
Machine Learning Life Cycle
 A cyclic process to build an efficient machine
learning project
 Machine learning life cycle involves seven major
steps:
 Gathering Data
 Data preparation
 Data Wrangling
 Analyze Data
 Train the model
 Test the model
 Deployment
Gathering Data & Data
preparation
 Gathering Data:
 The goal of this step is to identify the data sources and obtain all the data needed for the problem.
 Identify various data sources
 Collect data
 Integrate the data obtained from different sources
 At the end of this step, we have a dataset that is used in the later steps of the process.
Data Preparation
 The data is put into a suitable place and prepared for use in machine learning training.
 This step can be further divided into two processes:
 Data exploration:
 It is used to understand the nature of the data that we have to work with. We need to understand the characteristics, format, and
quality of the data.
 A better understanding of the data leads to a more effective outcome. In this step, we look for correlations, general trends, and outliers.
 Data pre-processing:
 Now the next step is preprocessing of data for its analysis.
Data Wrangling
 Data wrangling is the process of cleaning and converting raw data into a usable format.
 It involves cleaning the data, selecting the variables to use, and transforming the
data into a proper format to make it more suitable for analysis in the next step. It is one of
the most important steps of the complete process. Cleaning is required to address
data quality issues.
 It handles the following issues:
 Missing values
 Duplicate data
 Invalid data
 Noise
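As a minimal sketch of the first three issues listed above, the snippet below cleans a small list of (name, age) records. The records, the valid age range of 0 to 120, and the choice of filling missing ages with the median are assumptions made for the example; noise handling is left out.

```python
from statistics import median


def wrangle(rows):
    """Clean (name, age) records: drop duplicates and invalid ages, fill missing ages."""
    seen = set()
    cleaned = []
    for name, age in rows:
        if (name, age) in seen:  # duplicate data: keep only the first copy
            continue
        seen.add((name, age))
        if age is not None and not (0 <= age <= 120):  # invalid data: impossible age
            continue
        cleaned.append((name, age))
    # Missing values: replace None with the median of the ages that are known.
    known = [age for _, age in cleaned if age is not None]
    fill = median(known)
    return [(name, fill if age is None else age) for name, age in cleaned]


raw = [("Asha", 21), ("Ravi", None), ("Asha", 21), ("Tara", -5), ("Kiran", 25)]
clean = wrangle(raw)
print(clean)
```

Whether to drop, fill, or correct a bad record depends on the dataset; dropping and median filling are only two of the common options.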
Learning
 A computer program is said to learn from experience ‘E’ with respect to some class of
tasks ‘T’ and performance measure ‘P’, if its performance at tasks in ‘T’, as measured by ‘P’,
improves with experience ‘E’.

• Learning problem: Playing checkers
    Task ‘T’: playing checkers
    Performance measure ‘P’: percent of games won against opponents
    Training experience ‘E’: playing practice games against itself
• Learning problem: Handwriting recognition
    Task ‘T’: recognizing and classifying handwritten words within images
    Performance measure ‘P’: percent of words correctly classified
    Training experience ‘E’: a database of handwritten words with given classification
Types of Learning
 Supervised Learning
 Unsupervised Learning
 Semi-supervised Learning
 Reinforcement Learning
Supervised Learning
 Supervised learning is a type of machine learning in which machines are trained using
well "labelled" training data and, on the basis of that data, predict the output.
 The labelled data means some input data is already tagged with the correct output.

 In supervised learning, the training data provided to the machine works as the supervisor
that teaches the machine to predict the output correctly. It applies the same concept as a
student learning under the supervision of a teacher. Hence the name supervised learning.
 Supervised learning is a process of providing input data as well as correct output data to
the machine learning model. The aim of a supervised learning algorithm is to find a
mapping function that maps the input variable (x) to the output variable (y).
How Supervised Learning
Works?
 Data Collection and Labelling
 The first step is to collect a representative and diverse dataset.
 The labeling process involves assigning the correct output label to each input example in the dataset.

 Training and Test Sets


 Dataset is divided into two subsets: the training set and the test set.
 The training set is used to train the model, while the test set is used to evaluate its performance on unseen data.
 The training set serves as the basis for the model to learn patterns and relationships between the input features and the output labels. The
test set, on the other hand, helps assess the model’s generalization ability and its performance on new, unseen data.

 Feature Extraction
 The relevant features are extracted from the input data.
 Feature extraction involves selecting or transforming the input features to capture the most relevant information for the learning task to
improve performance and reduce complexity.

 Model Selection and Training


 Choosing an appropriate machine learning algorithm is crucial for the success of supervised learning.
 Once the algorithm is selected, the model is trained using the labeled training data.

 Prediction and Evaluation


 Once the model is trained, it can be used to make predictions on new, unseen data. The input features of the unseen data are fed into the
trained model, which generates predictions or classifications based on the learned patterns.
 To evaluate the model’s performance, the predicted outputs are compared against the true labels of the unseen data.
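The steps above (split, train, predict, evaluate) can be sketched end to end with a 1-nearest-neighbour classifier. The dataset, the feature choice (height and weight), the 25% test fraction, and the helper names are all assumptions made for this example, not part of the slides.

```python
import random


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the labelled samples and cut them into a training set and a test set."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Predict the label of x as the label of the closest training point."""
    nearest = min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], x)))
    return nearest[1]


def accuracy(train, test):
    """Compare predictions on the test set against the true labels."""
    correct = sum(predict_1nn(train, x) == y for x, y in test)
    return correct / len(test)


# Hypothetical labelled data: (height_cm, weight_kg) -> label
dataset = [
    ((150, 45), "small"), ((152, 48), "small"), ((155, 50), "small"), ((158, 52), "small"),
    ((180, 80), "large"), ((183, 85), "large"), ((185, 88), "large"), ((188, 90), "large"),
]

train_set, test_set = train_test_split(dataset)
print("test accuracy:", accuracy(train_set, test_set))
```

For 1-nearest-neighbour, "training" is just storing the labelled examples; other algorithms would fit parameters at that step, but the split and evaluation logic stays the same.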
Supervised Learning (Example)
Types of Supervised
Learning

 Regression
 Regression algorithms are used when the output variable is a continuous value that depends on the input variables.
 In regression tasks, the machine learning program must estimate, and understand, the relationships
among variables.
 Classification
 Classification algorithms are used when the output variable is categorical, which means it takes one of a fixed
set of classes, such as Yes-No, Male-Female, True-False, etc.
 In classification tasks, the machine learning program must draw a conclusion from observed values and determine
to what category new observations belong.
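The difference in output type can be illustrated with a small sketch. The regression part fits a straight line by ordinary least squares; the classification part simply thresholds that prediction into Pass/Fail, which is done here only to contrast a continuous output with a categorical one and is not a recommended classifier. The study-hours data and the pass mark are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied -> marks obtained (here exactly marks = 2 * hours + 50)
hours = [1, 2, 3, 4, 5]
marks = [52, 54, 56, 58, 60]
slope, intercept = fit_line(hours, marks)


def predict_marks(h):
    """Regression: the output is a continuous number."""
    return slope * h + intercept


def predict_pass(h, pass_mark=55):
    """Classification: the output is one of two categories."""
    return "Pass" if predict_marks(h) >= pass_mark else "Fail"


print(predict_marks(6), predict_pass(2), predict_pass(3))
```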
Supervised Learning
Algorithms
Some of the most popularly used supervised learning algorithms are:
 Linear Regression
 Logistic Regression
 Support Vector Machine
 K Nearest Neighbor
 Decision Tree
 Random Forest
 Naive Bayes
Merits & Demerits of
Supervised Learning
 Merits :
 With the help of supervised learning, the model can predict the output on the basis of prior
experiences.
 In supervised learning, we can have an exact idea about the classes of objects.
 Supervised learning model helps us to solve various real-world problems such as fraud detection,
spam filtering, etc.

 Demerits:
 Supervised learning models are not well suited to very complex tasks for which correct labelled outputs are hard to define.
 Supervised learning may fail to predict the correct output if the test data is very different from the training
dataset.
 Training requires a lot of computation time.
 In supervised learning, we need enough knowledge about the classes of objects.
Unsupervised Learning
 Unsupervised learning uses unlabeled data to train machines, which act on the data without any
supervision.
 The model learns from the data, discovers the patterns and features in the
data, and returns the output. As it assesses more data, its ability to make
decisions on that data gradually improves and becomes more refined.
 The goal of unsupervised learning is to find the underlying structure of the
dataset, group the data according to similarities, and represent the
dataset in a compressed format.
Why Unsupervised Learning
 Unsupervised learning is helpful for finding useful insights from the data.
 Unsupervised learning is similar to how a human learns to think from their
own experiences, which makes it closer to real AI.
 Unsupervised learning works on unlabeled and uncategorized data, which
makes it important in practice.
 In the real world, we do not always have input data with the corresponding
output, so unsupervised learning is needed to handle such cases.
Working of Unsupervised
Learning
 Unlabeled raw data is taken as input.
 This unlabeled input data is fed to the machine learning model in order to train it.
 First, the model interprets the raw data to find hidden patterns, and then a suitable
algorithm, such as k-means clustering or hierarchical clustering, is applied to the
input data.
 Once applied, the algorithm divides the data objects into groups
according to the similarities and differences between the objects.
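A minimal sketch of k-means clustering, which is named above, shows how unlabeled points are divided into groups by similarity. The 2-D points, the two starting centroids, and the fixed number of iterations are assumptions chosen to keep the example short; practical implementations usually pick starting centroids at random and stop when the assignments no longer change.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Unlabeled data: no output column, only the points themselves.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (9, 8)])
print(centroids)
print(clusters)
```

No labels are used anywhere: the two groups emerge only from the distances between the points.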
Types of Unsupervised
Learning Algorithm
 Clustering
 Dimensionality Reduction
 Association Rule Mining
Clustering
 Clustering involves grouping sets of similar data (based on defined criteria). It’s useful for
segmenting data into several groups and performing analysis on each data set to find patterns.
 Example: K Means Clustering, Hierarchical Clustering, DBSCAN (Density-Based Spatial Clustering
of Applications with Noise)
 Dimension reduction
 Dimensionality reduction techniques are used to reduce the number of input variables or features
while retaining meaningful information.
 Example: Principal Component Analysis
 Association rule mining
 Association rule learning is an unsupervised learning method used for finding
relationships between variables in a large database. It determines the sets of items that
occur together in the dataset, which can make a marketing strategy more effective.
 Association rule mining focuses on discovering interesting relationships or patterns in
transactional data. It is commonly used in market basket analysis and recommendation
systems. The widely used algorithm for association rule mining is the Apriori algorithm.
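Market basket analysis rests on two standard measures, support and confidence, which can be computed directly. The sketch below is not the Apriori algorithm itself, only the measures such algorithms use to rank rules; the four shopping baskets are invented for the example.

```python
# Hypothetical transactional data: each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent):
    """How often the consequent appears in transactions that contain the antecedent."""
    return support(antecedent | consequent) / support(antecedent)


print("support(bread, milk) =", support({"bread", "milk"}))
print("confidence(bread -> milk) =", confidence({"bread"}, {"milk"}))
```

Here bread and milk occur together in 2 of 4 baskets, and 2 of the 3 baskets with bread also contain milk, so the rule "bread -> milk" has support 0.5 and confidence of about 0.67.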
Merits & Demerits of
Unsupervised Learning
 Advantages of Unsupervised Learning
 Unsupervised learning can be used for more complex tasks than supervised
learning because it does not require labeled input data.
 Unsupervised learning is preferable when labels are scarce, as unlabeled data is easier to
obtain than labeled data.
 Disadvantages of Unsupervised Learning
 Unsupervised learning is intrinsically more difficult than supervised learning, as it has
no corresponding output to learn from.
 The result of an unsupervised learning algorithm might be less accurate, as the input data is
not labeled and the algorithm does not know the exact output in advance.
Semi-supervised Learning
 Semi-supervised learning is similar to supervised learning, but it uses both labelled and
unlabeled data. By using this combination, machine learning algorithms can learn to label
unlabeled data.
 It is a method that uses a small amount of labeled data and a large amount of
unlabeled data to train a model.
 The goal of semi-supervised learning is to learn a function that can accurately predict
the output variable based on the input variables, similar to supervised learning.
However, unlike supervised learning, the algorithm is trained on a dataset that contains
both labeled and unlabeled data.
 Semi-supervised learning is particularly useful when there is a large amount of
unlabeled data available, but it’s too expensive or difficult to label all of it.
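One common way to combine a few labels with many unlabeled points is self-training: train on the labeled data, pseudo-label the unlabeled point the model is most sure about, and repeat. The sketch below uses a nearest-class-mean model on 1-D data; the technique choice, the data, and all names are assumptions for illustration, since the slides do not name a specific semi-supervised algorithm.

```python
def centroid_classifier(labeled):
    """Return the mean of each class for 1-D labelled data [(x, label), ...]."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(means, x):
    """Predict the class whose mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))


def self_train(labeled, unlabeled, rounds=3):
    """Self-training: repeatedly pseudo-label the most confident unlabeled point."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        means = centroid_classifier(labeled)
        # "Most confident" here means closest to any class mean.
        best = min(pool, key=lambda x: min(abs(x - m) for m in means.values()))
        labeled.append((best, predict(means, best)))
        pool.remove(best)
    return centroid_classifier(labeled)


# Small labeled set, larger unlabeled set (hypothetical values).
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [2.0, 3.0, 8.0, 7.0]
model = self_train(labeled, unlabeled, rounds=4)
print(model, predict(model, 4.0))
```

The class means shift as pseudo-labeled points are absorbed, which is how the unlabeled data influences the final model. A wrong pseudo-label is never corrected in this sketch, which is a known weakness of plain self-training.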
Semi-supervised learning
flowchart
Semi-supervised learning
Examples
 Text classification: In text classification, the goal is to classify a given text into one or more
predefined categories. Semi-supervised learning can be used to train a text classification
model using a small amount of labeled data and a large amount of unlabeled text data.
 Image classification: In image classification, the goal is to classify a given image into one
or more predefined categories. Semi-supervised learning can be used to train an image
classification model using a small amount of labeled data and a large amount of unlabeled
image data.
 Anomaly detection: In anomaly detection, the goal is to detect patterns or observations
that are unusual or different from the norm.
Reinforcement Learning
 Reinforcement Learning is a feedback-based
Machine learning technique in which an agent learns
to behave in an environment by performing the
actions and seeing the results of actions. For each
good action, the agent gets positive feedback, and for
each bad action, the agent gets negative feedback or
penalty.
 Reinforcement learning teaches the machine through trial
and error. It learns from past experiences and begins
to adapt its approach in response to the situation to
achieve the best possible result.
 Reinforcement learning problems are reward-based.
For every task or for every step completed, there will
be a reward received by the agent. If the task is not
achieved correctly, there will be some penalty added.
 RL solves a specific type of problem where decision making is sequential, and the goal is
long-term, such as game-playing, robotics, etc.
 The agent interacts with the environment and explores it by itself. The primary goal of an
agent in reinforcement learning is to improve the performance by getting the maximum
positive rewards.
 Reinforcement learning is a type of machine learning method where an intelligent agent
(computer program) interacts with the environment and learns to act within it.
Example: AI agent present within a
maze environment, to find the diamond
 The agent interacts with the environment by
performing some actions, and based on those actions,
the state of the agent gets changed, and it also
receives a reward or penalty as feedback.
 The agent continues doing these three things (take
action, change state/remain in the same state, and
get feedback), and by doing these actions, it learns
and explores the environment.
 The agent learns which actions lead to positive
feedback or rewards and which actions lead to
negative feedback or a penalty. As a positive reward, the
agent gets a positive point, and as a penalty, it gets a
negative point.
Key Features of RL
 In RL, the agent is not instructed about the environment and what actions need to be
taken.
 It is based on a trial-and-error process.
 The agent takes the next action and changes states according to the feedback of the
previous action.
 The agent may get a delayed reward.
 The environment is stochastic, and the agent needs to explore it to get the
maximum positive rewards.
RL types
 Positive Reinforcement
 Negative Reinforcement

 Positive reinforcement means adding something to increase the
tendency that the expected behavior will occur again. It impacts the behavior of
the agent positively and increases the strength of the behavior.
 Negative reinforcement is the opposite of positive reinforcement in that it
increases the tendency that the specific behavior will occur again by avoiding a negative
condition.
Reinforcement Learning
Algorithms
Some of the important reinforcement learning algorithms are:
 Q-learning
 Sarsa
 Monte Carlo
 Deep Q network
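
As a minimal sketch of the first algorithm in the list, the code below runs tabular Q-learning on a simplified version of the maze example: a corridor of five cells with the diamond in the last one. The corridor layout, the reward of +1 for reaching the diamond, and the values of the learning rate, discount factor, and exploration rate are all assumptions chosen for the example.

```python
import random

N_STATES = 5                 # cells 0..4 in a corridor; the diamond is in cell 4
GOAL = N_STATES - 1
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration rate
LEFT, RIGHT = 0, 1


def step(state, action):
    """Environment: move one cell; reward +1 only when the diamond is reached."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_row, rng):
    """Epsilon-greedy choice: explore sometimes, otherwise take the best-known action."""
    if rng.random() < EPSILON or q_row[LEFT] == q_row[RIGHT]:
        return rng.randrange(2)  # random action (also used to break ties)
    return LEFT if q_row[LEFT] > q_row[RIGHT] else RIGHT


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action], all zero at the start
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = choose_action(q[state], rng)
            next_state, reward = step(state, action)
            # Q-learning update: move Q(s, a) towards reward + discounted best next value.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
policy = ["left" if q_table[s][LEFT] > q_table[s][RIGHT] else "right" for s in range(GOAL)]
print(policy)
```

The reward is delayed (only the final step pays), yet the discounted update propagates its value backwards through the table, so the learned policy moves right in every cell.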
