
DCIT 313

Introduction to Artificial
Intelligence

Session 6 – Machine Learning and Data Mining

Course Writer: Dr Kofi Sarpong Adu-Manu


Contact Information: ksadu-manu@ug.edu.gh

CBAS
School of Physical and Mathematical Sciences
2020/2021 – 2022/2023
Subsets of Artificial Intelligence
Lesson Goals:
• Understand the basic concepts of the learning problem and why/how machine
learning methods are used to learn from data to find underlying patterns for
prediction and decision-making.

• Understand the learning algorithm trade-offs, balancing performance within training data and robustness on unobserved test data.

• Differentiate between supervised and unsupervised learning methods, as well as regression versus classification methods.

• Understand the basic concepts of assessing model accuracy and the bias-variance
trade-off.
Introduction
• Machine learning is making great strides
– Large, good data sets
– Compute power
– Progress in algorithms
• Many interesting applications
– commercial
– scientific
• Links with artificial intelligence
– However, AI ≠ machine learning

Big Data is Everywhere
• We are in the era of big data!
– 40 billion indexed web pages
– 100 hours of video are uploaded
to YouTube every minute
• The deluge of data calls for
automated methods of data
analysis, which is what
machine learning provides!
What is Machine Learning?
• Machine learning is a set of methods that can
automatically detect patterns in data.

• These uncovered patterns are then used to predict future data, or to perform other kinds of decision-making under uncertainty.

• The key premise is learning from data!!


What is Machine Learning?
• Addresses the problem of analyzing huge bodies of data so
that they can be understood.

• Providing techniques to automate the analysis and exploration of large, complex data sets.

• Tools, methodologies, and theories for revealing patterns in data – a critical step in knowledge discovery.
What is Machine Learning?
• Driving Forces:
– Explosive growth of data in a great variety of fields
• Cheaper storage devices with higher capacity
• Faster communication
• Better database management systems
– Rapidly increasing computing power

• We want to make the data work for us!!


Examples of Learning Problems
• Machine learning plays a key role in many areas of science, finance
and industry:
– Predict whether a patient, hospitalized due to a heart attack, will have a
second heart attack. The prediction is to be based on demographic, diet and
clinical measurements for that patient.
– Predict the price of a stock in 6 months from now, on the basis of company
performance measures and economic data.
– Identify the numbers in a handwritten ZIP code, from a digitized image.
– Estimate the amount of glucose in the blood of a diabetic person, from the
infrared absorption spectrum of that person’s blood.
– Identify the risk factors for prostate cancer, based on clinical and
demographic variables.
Research Fields
• Statistics / Statistical Learning
• Data Mining
• Pattern Recognition
• Artificial Intelligence
• Databases
• Signal Processing
Applications
• Business
– Walmart data warehouse mined for advertising and logistics
– Credit card companies mined for fraudulent use of your card
based on purchase patterns
– Netflix developed movie recommender system
• Genomics
– Human genome project: collection of DNA sequences,
microarray data
Applications (cont.)
• Information Retrieval
– Terabytes of data on the internet, multimedia information
(video/audio files)

• Communication Systems
– Speech recognition, image analysis
The Learning Problem
• Learning from data is used in situations where we
don’t have any analytic solution, but we do have data
that we can use to construct an empirical solution

• The basic premise of learning from data is the use of a set of observations to uncover an underlying process.
The Learning Problem (cont.)
• Suppose we observe a response Y (the output space) and predictors X = (X₁, X₂, …, Xₚ) (the input space).
• We believe that there is a relationship between Y and at least one of the X's.
• We can model the relationship as

    Y = f(X) + ε

where f is an unknown function and ε is a random error (noise) term, independent of X with mean zero.
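As a quick illustration, here is a minimal Python sketch that simulates data from this model (the particular f and noise level are invented for the demo):

    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):                       # the target function (unknown in practice)
        return np.sin(2 * np.pi * x)

    n = 100
    X = rng.uniform(0, 1, n)        # inputs
    eps = rng.normal(0, 0.1, n)     # noise: independent of X, mean zero
    Y = f(X) + eps                  # observed responses, Y = f(X) + ε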
The Learning Problem (cont.)
The Learning Problem: Example

0.10
0.05
0.00
y

-0.05
-0.10

0.0 0.2 0.4 0.6 0.8 1.0

x
The Learning Problem: Example (cont.)
The Learning Problem: Example (cont.)

• Different estimates for the target function f that depend on the standard deviation of the ε's.
Why do we estimate f?
• We use modern machine learning methods to estimate f by
learning from the data.
• The target function f is unknown.
• We estimate f for two key purposes:
– Prediction
– Inference
Prediction
• By producing a good estimate for f where the variance of ε
is not too large, then we can make accurate predictions for
the response variable, Y, based on a new value of X.
• We can predict Y using Ŷ = f̂(X), where f̂ represents our estimate for f, and Ŷ represents the resulting prediction for Y.
Prediction (cont.)
• The accuracy of Ŷ as a prediction for Y depends on:
– Reducible error
– Irreducible error

• Note that f̂ will not be a perfect estimate for f; this inaccuracy introduces error.
Prediction (cont.)
• This error is reducible because we can potentially
improve the accuracy of the estimated (i.e. hypothesis)
function by using the most appropriate learning
technique to estimate the target function f.
• Even if we could perfectly estimate f, there is still
variability associated with ε that affects the accuracy of
predictions = irreducible error.
Prediction (cont.)
• The expected prediction error is the average of the squared difference between the predicted and actual values of Y.
• Var(ε) represents the variance associated with ε.
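Written out (this is the standard decomposition from introductory statistical learning texts, not shown explicitly on the slide):

    E[(Y − Ŷ)²] = [f(X) − f̂(X)]² + Var(ε)

where the first term is the reducible error and Var(ε) is the irreducible error.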

• Our aim is to minimize the reducible error!!


Example: Direct Mailing Prediction
• We are interested in predicting how much money an individual will donate based on observations from 90,000 people on which we have recorded over 400 different characteristics.
• We don't care too much about each individual characteristic.
• Learning Problem:
– For a given individual, should I send out a mailing?
Inference
• Instead of prediction, we may also be interested in the
type of relationship between Y and the X’s.
• Key questions:
– Which predictors actually affect the response?
– Is the relationship positive or negative?
– Is the relationship a simple linear one or is it more complicated?
Example: Housing Inference
• We wish to predict median house price based on
numerous variables.
• We want to learn which variables have the largest effect
on the response and how big the effect is.
• For example, how much impact does the number of
bedrooms have on the house value?
How do we estimate f?
• First, we assume that we have observed a set of training
data.

{(X₁, Y₁), (X₂, Y₂), …, (Xₙ, Yₙ)}
• Second, we use the training data and a machine learning
method to estimate f.
– Parametric or non-parametric methods
Parametric Methods
• This reduces the learning problem of estimating the target
function f down to a problem of estimating a set of
parameters.

• This involves a two-step approach…


Parametric Methods (cont.)
• Step 1:
– Make some assumptions about the functional form of f. The
most common example is a linear model:

f(Xᵢ) = β₀ + β₁Xᵢ₁ + β₂Xᵢ₂ + ⋯ + βₚXᵢₚ
– In this course, we will examine far more complicated and
flexible models for f.
Parametric Methods (cont.)
• Step 2:
– We use the training data to fit the model (i.e. estimate f….the
unknown parameters).
– The most common approach for estimating the parameters in a
linear model is via ordinary least squares (OLS) linear
regression.
– However, there are superior approaches, as we will see in this
course.
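A minimal sketch of Step 2 with numpy's least-squares routine (the synthetic data and true coefficients below are assumptions for the demo):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.uniform(0, 1, (100, 2))            # two predictors
    beta_true = np.array([1.0, 2.0, -3.0])     # intercept and slopes (chosen for the demo)
    y = beta_true[0] + X @ beta_true[1:] + rng.normal(0, 0.1, 100)

    A = np.column_stack([np.ones(len(X)), X])         # design matrix with intercept column
    beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)  # OLS parameter estimates
    print(beta_hat)                                   # should be close to beta_true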
Example: Income vs. Education and Seniority
Example: OLS Regression Estimate
• Even if the standard deviation is low, we will still get a
bad answer if we use the incorrect model.
Non-Parametric Methods
• As opposed to parametric methods, these do not make
explicit assumptions about the functional form of f.
• Advantages:
– Accurately fit a wider range of possible shapes of f.
• Disadvantages:
– Requires a very large number of observations to acquire an
accurate estimate of f.
Example: Thin-Plate Spline Estimate
• Non-linear regression methods are more flexible and can potentially provide more accurate estimates.

• However, these methods can run the risk of over-fitting the data (i.e. following the errors, or noise, too closely), so too much flexibility can produce poor estimates for f.
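A sketch of a thin-plate spline fit with SciPy (the two-predictor data are invented; the smoothing parameter is what controls the over-fitting risk described above):

    import numpy as np
    from scipy.interpolate import RBFInterpolator

    rng = np.random.default_rng(2)
    X = rng.uniform(0, 1, (200, 2))      # two predictors, e.g. education and seniority
    y = np.sin(3 * X[:, 0]) + X[:, 1] + rng.normal(0, 0.1, 200)

    # smoothing=0 would interpolate the training data exactly (over-fitting);
    # a larger value gives a smoother, more robust surface.
    spline = RBFInterpolator(X, y, kernel='thin_plate_spline', smoothing=1.0)
    y_hat = spline(X)                    # fitted values at the training points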
Predictive Accuracy vs. Interpretability

• Conceptual Question:
– Why not just use a more flexible method if it is more realistic?

• Reason 1:
– A simple method (such as OLS regression) produces a model
that is easier to interpret (especially for inference purposes).
Predictive Accuracy vs. Interpretability (cont.)

• Reason 2:
– Even if the primary purpose of learning from the data is for prediction, it is often possible to get more accurate predictions with a simple rather than a complicated model.
Learning Algorithm Trade-off
• There are always two aspects to consider when designing
a learning algorithm:
– Try to fit the data well
– Be as robust as possible

• The predictor that you have generated using your training data must also work well on new data.
Learning Algorithm Trade-off (cont.)
• When we create predictors, usually the simpler the predictor is, the more robust it tends to be, in the sense of being able to be estimated reliably.

• On the other hand, simple models do not fit the training data aggressively.
Learning Algorithm Trade-off (cont.)
• Training Error vs. Testing Error:
– Training error → reflects whether the data fits well
– Testing error → reflects whether the predictor actually works on new data

• Bias vs. Variance:
– Bias → how good the predictor is, on average; tends to be smaller with more complicated models
– Variance → tends to be higher for more complex models
Learning Algorithm Trade-off (cont.)
• Fitting vs. Over-fitting:
– If you try to fit the data too aggressively, then you may over-fit the training data. This means that the predictor works very well on the training data, but is substantially worse on the unseen test data.
• Empirical Risk vs. Model Complexity:
– Empirical risk → error rate based on the training data
– Increasing model complexity decreases empirical risk, but makes the model less robust (higher variance)
Learning Spectrum
Supervised vs. Unsupervised Learning

• Supervised Learning:
– All the predictors, Xi, and the response, Yi, are observed.
• Many regression and classification methods

• Unsupervised Learning:
– Here, only the Xi’s are observed (not Yi’s).
– We need to use the Xi's to guess what Y would have been, and then build a model from there.
• Clustering and principal components analysis
Terminology
• Notation
– Input X: feature, predictor, or independent variable
– Output Y: response, dependent variable
• Categorization
– Supervised learning vs. unsupervised learning
• Key question: Is Y available in the training data?
– Regression vs. Classification
• Key question: Is Y quantitative or qualitative?
Terminology (cont.)
• Quantitative:
– Measurements or counts, recorded as numerical values (e.g.
height, temperature, etc.)

• Qualitative: groups or categories
– Ordinal: possesses a natural ordering (e.g. shirt sizes)
– Nominal: just names the categories (e.g. marital status, gender, etc.)
Terminology (cont.)
Supervised Learning
Supervised Learning:
Regression vs. Classification
• Regression
– Covers situations where Y is continuous (quantitative)
– E.g. predicting the value of the Dow in 6 months, predicting the
value of a given house based on various inputs, etc.

• Classification
– Covers situations where Y is categorical (qualitative)
– E.g. Will the Dow be up or down in 6 months? Is this email spam
or not?
Supervised Learning: Examples
• Email Spam:
– predict whether an email is a junk email (i.e. spam)
Supervised Learning: Examples
• Handwritten Digit Recognition:
– Identify single digits 0~9 based on images
Supervised Learning: Examples
• Face Detection/Recognition:
– Identify human faces
Supervised Learning: Examples
• Speech Recognition:
– Identify words spoken according to speech signals
• Automatic voice recognition systems used by airline companies,
automatic stock price reporting, etc.
Supervised Learning:
Linear Regression
Supervised Learning:
Linear/Quadratic Discriminant Analysis
Supervised Learning:
Logistic Regression
Supervised Learning:
K Nearest Neighbors
Supervised Learning:
Decision Trees / CART
Supervised Learning:
Support Vector Machines
Unsupervised Learning
Unsupervised Learning (cont.)
• The training data does not contain any output information at all (i.e. unlabeled data).

• Viewed as the task of spontaneously finding patterns and structure in input data.

• Viewed as a way to create a higher-level representation of the data and to perform dimension reduction.
Unsupervised Learning:
K-Means Clustering
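A minimal sketch with scikit-learn (the blobs and the choice of three clusters are assumptions for the demo):

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(3)
    # unlabeled data: three blobs in 2-D (no Y anywhere)
    X = np.vstack([rng.normal(c, 0.3, (50, 2)) for c in (0.0, 2.0, 4.0)])

    km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
    print(km.cluster_centers_)   # learned cluster centers
    print(km.labels_[:10])       # cluster assignments for the first 10 points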
Unsupervised Learning:
Hierarchical Clustering
Assessing Model Accuracy
• For a given set of data, we need to decide which machine
learning method produces the best results.

• We need some way to measure the quality of fit (i.e. how well its predictions actually match the observed data).

• In regression, we typically use mean squared error (MSE).
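For reference, the training MSE is

    MSE = (1/n) Σᵢ₌₁ⁿ (yᵢ − f̂(xᵢ))²

or, in one line of numpy (assuming arrays y and y_hat of observed and predicted values):

    mse = np.mean((y - y_hat) ** 2)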


Assessing Model Accuracy (cont.)
Assessing Model Accuracy (cont.)
• Thus, we really care about how well the method works on new, unseen test data.

• There is no guarantee that the method with the smallest training MSE will have the smallest test MSE.
Training vs. Test MSEs
• In general, the more flexible a method is, the lower its training MSE will be.

• However, the test MSE may in fact be higher for a more flexible method than for a simple approach like linear regression.
Training vs. Test MSEs (cont.)
• More flexible methods (such as splines) can generate a
wider range of possible shapes to estimate f as
compared to less flexible and more restrictive methods
(such as linear regression).

• The less flexible the method, the easier it is to interpret the model. Thus, there is a trade-off between flexibility and model interpretability.
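A small synthetic sketch of this trade-off, with numpy polynomial fits standing in for methods of increasing flexibility (all values here are invented for the demo):

    import numpy as np

    rng = np.random.default_rng(4)
    x = rng.uniform(0, 1, 60)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 60)

    # hold out half the data as a test set
    x_tr, y_tr, x_te, y_te = x[:30], y[:30], x[30:], y[30:]

    for degree in (1, 3, 15):                          # increasing flexibility
        coef = np.polyfit(x_tr, y_tr, degree)
        mse_tr = np.mean((y_tr - np.polyval(coef, x_tr)) ** 2)
        mse_te = np.mean((y_te - np.polyval(coef, x_te)) ** 2)
        print(degree, round(mse_tr, 3), round(mse_te, 3))
    # training MSE keeps falling as the degree grows;
    # test MSE falls at first, then rises again (over-fitting)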
Different Levels of Flexibility

Overfitting
Different Levels of Flexibility (cont.)
Different Levels of Flexibility (cont.)
Bias-Variance Trade-off
• The previous graphs of test versus training MSEs illustrate a very important trade-off that governs the choice of machine learning methods.

• There are always two competing forces that govern the choice of learning method:
– bias and variance
Bias of Learning Methods
• Bias refers to the error that is introduced by modeling a
real life problem (that is usually extremely complicated)
by a much simpler problem.

• Generally, the more flexible/complex a machine learning method is, the less bias it will have.
Variance of Learning Methods
• Variance refers to how much your estimate for f would change if you had a different training data set.

• Generally, the more flexible/complex a machine learning method is, the more variance it has.
The Trade-Off: Expected Test MSE
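For a test point x₀, the expected test MSE decomposes as (the standard bias-variance result):

    E[(y₀ − f̂(x₀))²] = Var(f̂(x₀)) + [Bias(f̂(x₀))]² + Var(ε)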
Test MSE, Bias and Variance
• Thus, in order to minimize the expected test MSE, we
must select a machine learning method that
simultaneously achieves low variance and low bias.

• Note that the expected test MSE can never lie below the irreducible error, Var(ε).
Test MSE, Bias and Variance (cont.)
The Classification Setting
• For a classification problem, we can use the
misclassification error rate to assess the accuracy of the
machine learning method.
    Error Rate = (1/n) Σᵢ₌₁ⁿ I(yᵢ ≠ ŷᵢ)

which represents the fraction of misclassifications.

• I(yᵢ ≠ ŷᵢ) is an indicator function, which gives 1 if the condition yᵢ ≠ ŷᵢ holds, and 0 otherwise.
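In code, the error rate is one line (assuming numpy arrays of true labels y and predictions y_hat):

    import numpy as np

    y     = np.array([0, 1, 1, 0, 1])
    y_hat = np.array([0, 1, 0, 0, 0])
    error_rate = np.mean(y != y_hat)   # fraction misclassified: 0.4 here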
Bayes Error Rate
• The Bayes error rate refers to the lowest possible error rate
that could be achieved if somehow we knew exactly what
the “true” probability distribution of the data looked like.

• On test data, no classifier can get lower error rates than the
Bayes error rate.

• In real-life problems, the Bayes error rate can't be calculated exactly.
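For reference (standard definitions, not spelled out on the slide): the Bayes classifier assigns each point to its most probable class,

    Ĉ(x₀) = argmaxⱼ Pr(Y = j | X = x₀)

and the Bayes error rate is 1 − E[maxⱼ Pr(Y = j | X)].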
Bayes Decision Boundary
• The purple dashed line represents the points where the probability is exactly 50%.

• The Bayes classifier's prediction is determined by the Bayes decision boundary.
K-Nearest Neighbors (KNN)
• KNN is a flexible approach to estimate the Bayes classifier.

• For any given X, we find the k closest neighbors to X in the training data and average their corresponding responses Y.

• If the majority of the Y's are orange, then we predict orange; otherwise we guess blue.

• The smaller k is, the more flexible the method will be.
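A minimal two-class sketch with scikit-learn (the blobs and the choice k = 10 are invented for the demo):

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(5)
    # two classes ("blue" = 0, "orange" = 1) as 2-D blobs
    X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)

    # small k -> flexible, wiggly boundary; large k -> smooth, rigid boundary
    knn = KNeighborsClassifier(n_neighbors=10).fit(X, y)
    print(knn.predict([[1.0, 1.0]]))   # majority vote among the 10 nearest neighbors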
KNN: K=10

[Figure: KNN decision boundary (k = 10) compared with the Bayes decision boundary]
KNN: K=1 and K=100

[Figure: left panel, K = 1 — low bias, high variance (overly flexible); right panel, K = 100 — high bias, low variance (less flexible)]
KNN Training vs. Test Error Rates

• Notice that the KNN training error rates (blue) keep going down as k decreases (i.e. as the flexibility increases).

• However, note that the KNN test error rate at first decreases but then starts to increase again.
Key Note: Bias-Variance Trade-Off
[Figure: error curves versus model complexity, with "Underfit" and "Overfit" regions marked]

• In general, training errors will always decline.

• However, test errors will decline at first (as reductions in bias dominate) but will then start to increase again (as increases in variance dominate).

• When selecting a machine learning method, remember that more flexible/complex is not necessarily better!!
Summary
Activity
Reference
Ertel, W., & Black, N. T. (2011). Introduction to
Artificial Intelligence. Berlin: Springer.
Acknowledgement
