
Learning
In which we describe agents that can improve their behavior through diligent study of their own experiences.

Lecture 28-Learning & Classification 1


Outline
• Learning
• Supervised Learning
• Unsupervised Learning
• Naïve Bayes Algorithm



Learning
• AI systems are complex and may have many parameters.
• It is impractical, and often impossible, to encode all the knowledge a system needs.
• Different types of data may require very different parameters.
• Instead of trying to hard-code all the knowledge, it makes sense to learn it.
• Learning can range from the trivial, as exhibited by jotting down a phone number, to the profound, as exhibited by Albert Einstein.
Machine Learning is…
Machine learning is about predicting the future based on the past.

(Diagram omitted: past — Training Data is used to learn a model/predictor; future — the model/predictor is applied to Testing Data to predict.)
Why Learn?
• To improve the design of an agent
• Designers cannot predefine all situations
• Designers cannot predefine all changes over time
• Sometimes human programmers have no idea how to program a solution themselves
Forms of Learning
Any component of an agent can be improved by learning from data. The improvements, and the techniques used to make them, depend on four major factors:
a. Which component is to be improved.
b. What prior knowledge the agent already has.
c. What representation is used for the data and the component.
d. What feedback is available to learn from.
Forms/Types of Learning
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
Learning from Observations
• Supervised Learning – learn a function from a set of training examples, which are pre-classified feature vectors.
• The agent observes some example input–output pairs and learns a function that maps from input to output (classification).
feature vector (shape, color) → class
(square, red)  → I
(square, blue) → I
(circle, red)  → II
(circle, blue) → II

Given a previously unseen feature vector such as (circle, green), (triangle, red), or (triangle, blue), what is the rule that tells us whether it is in class I or class II?
Classification
• Predicts clear-cut class labels.
• Classifies data (constructs a model) based on the training set and the values (class labels) of a classifying attribute, and uses the model to classify new data.
• A type of supervised learning.
• Typical applications:
  • credit approval
  • target marketing
  • medical diagnosis
  • treatment-effectiveness analysis
  • precision agriculture (yield prediction)
Classification: A Three-Step Process
1. Classifier/model construction
   • Each data sample is assumed to belong to a predefined class, as determined by one of the attributes (possibly composite), designated the class-label attribute.
   • The set of tuples with class labels used for model construction is the training set.
   • The classifier is represented as classification rules, a decision tree, or a mathematical formula that maps input data to its class.
2. Test the classifier
   • A separate set of tuples with class labels is used to test the classifier.
3. Use the classifier
   • Apply the classifier to future or unknown objects.
Classification Process (1): Model Construction

The training data are fed to a classification algorithm, which produces the classifier (model):

NAME  RANK            YEARS  TENURED
Mike  Assistant Prof  3      no
Mary  Assistant Prof  7      yes
Bill  Professor       2      yes
Jim   Associate Prof  7      yes
Dave  Assistant Prof  6      no
Anne  Associate Prof  3      no

Learned classifier (model):
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Test (2) and Classification Process (3)

The classifier is first evaluated on the labeled testing data, then applied to unseen data such as (Jeff, Professor, 4): tenured?

NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes
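The three-step process can be sketched in a few lines of Python. This is a minimal illustration: the "learned" model is simply the rule from the model-construction slide, hard-coded as a function.

```python
# Step 1 produced the model: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'.
def predict(rank, years):
    return "yes" if rank == "Professor" or years > 6 else "no"

# Step 2: test the classifier on the labeled testing data.
testing_data = [
    ("Tom",     "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George",  "Professor",      5, "yes"),
    ("Joseph",  "Assistant Prof", 7, "yes"),
]
correct = sum(predict(rank, years) == label
              for _, rank, years, label in testing_data)
print(correct / len(testing_data))   # 0.75 -- the rule misclassifies Merlisa

# Step 3: use the classifier on unseen data, e.g. (Jeff, Professor, 4).
print(predict("Professor", 4))       # yes
```

Note that the learned rule is not perfect on the test set: Merlisa has years > 6 but is not tenured, which is exactly why a separate testing step is needed.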
Classification by Decision Tree Induction
Decision tree
• A flow-chart-like tree structure for classifying objects
• Takes as input a vector of attribute values and returns a "decision": a single output value
• An internal node denotes a test on an attribute
• A branch represents an outcome of the test
• Leaf nodes represent class labels or class distributions
• In Boolean classification, each example input is classified as true (a positive example) or false (a negative example)

Decision tree generation consists of two phases:
1. Tree construction (training phase)
   • At the start, all the training examples are at the root
   • Partition examples recursively based on selected attributes
2. Tree pruning
   • Identify and remove branches that reflect noise or outliers

Use of a decision tree: classify an unknown sample by testing its attribute values against the decision tree.
Training Dataset ("buys_computer")

age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31..40  high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31..40  low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31..40  medium  no       excellent      yes
31..40  high    yes      fair           yes
>40     medium  no       excellent      no
Output: A Decision Tree for "buys_computer"

age?
├── <=30:   student?
│           ├── no:  no
│           └── yes: yes
├── 31..40: yes
└── >40:    credit rating?
            ├── excellent: no
            └── fair:      yes
Algorithm for Decision Tree Learning
Basic algorithm (a greedy divide-and-conquer algorithm):
• Assume attributes are categorical for now (continuous attributes can be handled too)
• The tree is constructed in a top-down recursive manner
• At the start, all the training examples are at the root
• Examples are partitioned recursively based on selected attributes
• Attributes are selected on the basis of an impurity function (e.g., information gain/entropy)

Conditions for stopping partitioning:
• All examples for a given node belong to the same class
• There are no remaining attributes for further partitioning (the majority class becomes the leaf)
• There are no examples left
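The attribute-selection step can be sketched directly from these definitions. Below is a minimal information-gain computation (entropy as the impurity function) on the "buys_computer" training set; the class labels follow the decision tree shown for that dataset.

```python
import math
from collections import Counter

# buys_computer training data: (age, income, student, credit_rating, buys).
DATA = [
    ("<=30",   "high",   "no",  "fair",      "no"),
    ("<=30",   "high",   "no",  "excellent", "no"),
    ("31..40", "high",   "no",  "fair",      "yes"),
    (">40",    "medium", "no",  "fair",      "yes"),
    (">40",    "low",    "yes", "fair",      "yes"),
    (">40",    "low",    "yes", "excellent", "no"),
    ("31..40", "low",    "yes", "excellent", "yes"),
    ("<=30",   "medium", "no",  "fair",      "no"),
    ("<=30",   "low",    "yes", "fair",      "yes"),
    (">40",    "medium", "yes", "fair",      "yes"),
    ("<=30",   "medium", "yes", "excellent", "yes"),
    ("31..40", "medium", "no",  "excellent", "yes"),
    ("31..40", "high",   "yes", "fair",      "yes"),
    (">40",    "medium", "no",  "excellent", "no"),
]
ATTRS = ["age", "income", "student", "credit_rating"]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr):
    """Entropy of the whole set minus the weighted entropy after splitting on attr."""
    i = ATTRS.index(attr)
    remainder = 0.0
    for value in set(r[i] for r in rows):
        subset = [r[-1] for r in rows if r[i] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[-1] for r in rows]) - remainder

# The greedy step: the attribute with the highest information gain becomes the root test.
root = max(ATTRS, key=lambda a: info_gain(DATA, a))
print(root)   # age
```

Splitting on age gives the highest gain, which is why age? sits at the root of the tree on the output slide.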
A Decision Tree from the Loan Data
• A tree consists of decision nodes and leaf nodes (classes).
(Figure omitted: a decision tree built from the loan data, and its use to classify a new case.)

Is the decision tree unique?
• No. A simpler tree can classify the same data.
• We want trees that are both small and accurate: easier to understand, and they perform better.
• Finding the best tree is NP-hard.
• All current tree-building algorithms are heuristic algorithms.
Two possible roots: which is better?
(Figure omitted: two candidate root attributes, labeled (A) and (B).)
• Fig. (B) seems to be better.
Information Theory: The Entropy Measure
• A measure of the amount of uncertainty or randomness in data
• Tells us about the predictability of a certain event
• Example: consider a coin toss whose probability of heads is 0.5 and probability of tails is 0.5
• Lower entropy is considered good
• High entropy means more noise or poorer classification
Entropy Measure: Getting a Feel
• As the data become purer and purer, the entropy value becomes smaller and smaller. This is useful to us!
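The "purer data, smaller entropy" intuition is easy to check numerically. A minimal sketch of the entropy measure, H = -Σ pᵢ·log₂(pᵢ), applied to a few distributions:

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: -sum(p * log2(p)) over nonzero probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# A fair coin (heads 0.5, tails 0.5) is maximally unpredictable: 1 bit.
print(entropy([0.5, 0.5]))   # 1.0

# As the data become purer, the entropy value becomes smaller.
print(entropy([0.9, 0.1]))   # ~0.469
print(entropy([1.0]))        # 0.0 (perfectly pure, fully predictable)
```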
Learning from Observations
• Unsupervised Learning – no classes are given; the idea is to find patterns in the data. This generally involves clustering.
• Reinforcement Learning – learn from feedback after a decision is made.
Clustering
• The process of grouping similar entities together.
• The goal is to find similarities in the data points and group similar data points together.
• Used in many fields, including machine learning, pattern recognition, robotics, computer vision, and image analysis.
• Given:
  • a data set D (training set)
  • a similarity/distance metric
• Find:
  • a partitioning of the data
  • groups of similar/close items
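A minimal sketch of the "given a data set and a distance metric, find a partitioning" task: one-dimensional k-means, using absolute difference as the distance metric. The data values and initial centroids are made up for illustration.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Partition points into len(centroids) groups of similar/close items."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters, centroids

data = [1.0, 1.5, 2.0, 10.0, 11.0, 12.0]
clusters, centroids = kmeans_1d(data, centroids=[0.0, 5.0])
print(clusters)    # [[1.0, 1.5, 2.0], [10.0, 11.0, 12.0]]
```

The two natural groups in the data are recovered regardless of the rough initial centroids, because the assignment and update steps alternate until the partitioning stabilizes.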
Example: Similarity
Groups of similar customers:
• similar demographics/population
• similar buying behavior
• similar health
Groups of similar products:
• similar cost
• similar function
• similar store

In simple words, the aim is to separate groups with similar traits and assign them into clusters.
Hard vs. Soft Clustering
• Hard clustering: each document/sample belongs to exactly one cluster
  • More common and easier to do
  • Each sample is put into one group out of the many groups
• Soft clustering: a document can belong to more than one cluster, based on probabilities
  • Makes more sense for applications like creating browsable hierarchies
  • You may want to put a pair of sneakers in two clusters: (i) sports apparel and (ii) shoes
Naive Bayes Classifier
• Ever wondered how your emails are magically classified as spam or not spam?
• The algorithm we discuss now is the magic behind that.
• Naive Bayes is a supervised machine learning algorithm inspired by Bayes' theorem, which you may have studied in mathematics.
• It works on the principles of conditional probability.


Naïve Bayes Classifier
• Bayes' theorem is the heart of the Naive Bayes algorithm:

  P(A|B) = P(B|A) · P(A) / P(B)

• In this formula:
  • A and B are two events,
  • P(A) and P(B) are the probabilities of occurrence of events A and B respectively, regardless of each other, and
  • P(B|A) is the conditional probability that event B occurs, given that A has occurred.
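A tiny numeric sketch of the theorem, tied to the spam example above. The numbers are hypothetical: let A = "message is spam" and B = "message contains the word 'offer'".

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_spam = 0.3             # P(A): prior probability that a message is spam
p_word_given_spam = 0.6  # P(B|A): 'offer' appears in 60% of spam messages
p_word = 0.4             # P(B): 'offer' appears in 40% of all messages

print(bayes(p_word_given_spam, p_spam, p_word))   # ~0.45 -> P(spam | 'offer')
```

Seeing the word raises the belief that the message is spam from the prior 0.3 to 0.45, which is exactly the update the classifier performs for each feature.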
Example
• Given all previous patients I've seen (their symptoms and diagnoses)...
• Do I believe that a patient with the following symptoms has the flu?
(Table omitted: symptoms and diagnoses of the previous patients.)
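Since the slide's patient table is not reproduced here, the records below are made up for illustration; the mechanics, however, are exactly the Naive Bayes computation: a class prior multiplied by per-symptom conditional probabilities, assuming symptoms are conditionally independent given the diagnosis.

```python
# Hypothetical patient history: (chills, fever, diagnosis).
patients = [
    ("yes", "yes", "flu"),
    ("yes", "no",  "flu"),
    ("yes", "yes", "flu"),
    ("no",  "yes", "flu"),
    ("no",  "no",  "healthy"),
    ("yes", "no",  "healthy"),
    ("no",  "no",  "healthy"),
    ("no",  "yes", "healthy"),
]

def naive_bayes_score(symptoms, diagnosis):
    """P(diagnosis) * product over symptoms of P(symptom_i | diagnosis)."""
    rows = [p for p in patients if p[2] == diagnosis]
    score = len(rows) / len(patients)                     # class prior
    for i, value in enumerate(symptoms):
        score *= sum(1 for r in rows if r[i] == value) / len(rows)
    return score

# Do I believe a new patient with chills and fever has the flu?
new_patient = ("yes", "yes")
flu = naive_bayes_score(new_patient, "flu")
healthy = naive_bayes_score(new_patient, "healthy")
print("flu" if flu > healthy else "healthy")   # flu
```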


Advantages of Naive Bayes
• Does not require a lot of training data
• Very fast
• Simple and easy to implement
• Handles both continuous and discrete data


Disadvantages of Naive Bayes
• Makes a very strong assumption of conditional independence
• Requires Laplace correction when an attribute value has zero probability
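The Laplace correction mentioned above is a one-line fix. If an attribute value never occurs with a class in the training data, its estimated probability is 0, which zeroes out the entire Naive Bayes product; adding 1 to every count (and the number of possible attribute values to the denominator) keeps every estimate nonzero. The counts below are made up for illustration.

```python
def laplace_estimate(value_count, class_count, num_values):
    """Laplace-corrected P(value | class): add 1 to the count, and the
    number of possible attribute values to the denominator."""
    return (value_count + 1) / (class_count + num_values)

# Attribute 'color' has 3 possible values; 'red' was never seen with this class
# across its 10 training samples.
print(0 / 10)                      # 0.0   -- raw estimate wipes out the product
print(laplace_estimate(0, 10, 3))  # ~0.077 -- corrected: small but nonzero
```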


End

