Advice For Applying Machine Learning: Deciding What To Try Next

The document provides advice on applying machine learning algorithms and diagnosing issues. It discusses: 1) Evaluating a model on a test set to check for overfitting when the model fails to generalize. 2) Using training, validation, and test sets to select models and avoid overfitting during selection. 3) Diagnosing overfitting (high variance) vs underfitting (high bias) issues using different error rates. 4) How regularization can be used to address high variance by reducing model complexity.

Uploaded by

Sujit Kumar

Advice for applying machine learning: Deciding what to try next

Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to predict housing
prices.

However, when you test your hypothesis on a new set of houses, you find that it
makes unacceptably large errors in its predictions. What should you try next?
- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing the regularization parameter λ
- Try increasing the regularization parameter λ
Andrew Ng
Machine learning diagnostic:
Diagnostic: A test that you can run to gain insight into what is/isn’t working with a learning algorithm, and to gain guidance on how best to improve its performance.

Diagnostics can take time to implement, but doing so can be a very good use of your time.

Advice for applying machine learning: Evaluating a hypothesis

Evaluating your hypothesis
Fails to generalize to new examples not in training set.
[Figure: price plotted against size]
Features:
- size of house
- no. of bedrooms
- no. of floors
- age of house
- average income in neighborhood
- kitchen size

Evaluating your hypothesis
Dataset:
Size Price
2104 400
1600 330
2400 369
1416 232
3000 540
1985 300
1534 315
1427 199
1380 212
1494 243

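Training/testing procedure for linear regression

- Learn the parameters θ from the training data (minimizing the training error J(θ))
- Compute the test set error J_test(θ): the squared-error cost of the learned hypothesis, evaluated on the test examples
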
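The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single feature (size), a closed-form least-squares fit, a split that uses the first 7 rows of the dataset slide for training and the last 3 for testing, and the conventional 1/(2m) scaling of the cost. The helper names `fit_line` and `squared_error_cost` are hypothetical.

```python
# Sketch: fit one-feature linear regression on a training split and
# evaluate the squared-error cost on a held-out test split.
# The 7/3 split and the 1/(2m) scaling are assumptions for illustration.

def fit_line(xs, ys):
    """Least-squares fit of y = theta0 + theta1 * x (closed form)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

def squared_error_cost(theta, xs, ys):
    """Sum of squared errors divided by 2m."""
    theta0, theta1 = theta
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

# (size, price) rows from the dataset slide
data = [(2104, 400), (1600, 330), (2400, 369), (1416, 232), (3000, 540),
        (1985, 300), (1534, 315), (1427, 199), (1380, 212), (1494, 243)]
train, test = data[:7], data[7:]   # assumed split: first 7 train, last 3 test

x_train, y_train = [s for s, _ in train], [p for _, p in train]
x_test, y_test = [s for s, _ in test], [p for _, p in test]

theta = fit_line(x_train, y_train)
j_train = squared_error_cost(theta, x_train, y_train)
j_test = squared_error_cost(theta, x_test, y_test)
print(f"J_train = {j_train:.1f}, J_test = {j_test:.1f}")
```

If the rows are stored in some meaningful order, shuffling them before splitting is usually the safer choice.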
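Training/testing procedure for logistic regression
- Learn the parameters θ from the training data
- Compute the test set error J_test(θ): the logistic regression cost, evaluated on the test examples
- Misclassification error (0/1 misclassification error): count an error whenever the predicted label disagrees with the true label, and report the fraction of test examples that are misclassified
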
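As a rough illustration of the 0/1 misclassification error, the sketch below thresholds a logistic hypothesis at 0.5 and reports the fraction of test examples it gets wrong. The threshold, the parameter values, and the tiny test set are all hypothetical choices made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(theta, x, threshold=0.5):
    """Predict 1 when the logistic hypothesis is at or above the threshold."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1 if sigmoid(z) >= threshold else 0

def misclassification_error(theta, xs, ys):
    """Fraction of examples whose predicted label differs from the true label."""
    wrong = sum(1 for x, y in zip(xs, ys) if predict_label(theta, x) != y)
    return wrong / len(xs)

# Hypothetical parameters and test set; each x starts with the intercept term 1.
theta = [-1.0, 2.0]
x_test = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 0.2]]
y_test = [0, 1, 1, 1]

test_error = misclassification_error(theta, x_test, y_test)
print(test_error)  # the last example is misclassified, so 1 of 4 -> 0.25
```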
Advice for applying machine learning: Model selection and training/validation/test sets

Overfitting example
[Figure: price plotted against size]
Once the parameters θ were fit to some set of data (the training set), the error of the parameters as measured on that data (the training error J(θ)) is likely to be lower than the actual generalization error.

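Model selection
Candidate models, indexed by the degree of polynomial d:
1. d = 1
2. d = 2
3. d = 3
...
10. d = 10
Choose the degree d whose fitted model has the lowest test set error.
How well does the model generalize? Report the test set error J_test(θ).
Problem: J_test(θ) is likely to be an optimistic estimate of the generalization error, i.e. our extra parameter (d = degree of polynomial) is fit to the test set.
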
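Train/validation/test error
Training error: J_train(θ), the cost measured on the training set
Cross validation error: J_cv(θ), the same cost measured on the cross validation set
Test error: J_test(θ), the same cost measured on the test set
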
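A minimal sketch of the three-way split and the three error measurements follows. The 60/20/20 proportions, the helper names `split_dataset` and `cost`, and the fixed (unfitted) hypothesis are assumptions made for illustration; the data rows are the ones from the dataset slide.

```python
def split_dataset(rows, train_frac=0.6, cv_frac=0.2):
    """Split rows in order into training, cross validation, and test sets.

    The 60/20/20 proportions are an assumed default. Shuffle the rows first
    if their order is not random.
    """
    n_train = round(len(rows) * train_frac)
    n_cv = round(len(rows) * cv_frac)
    train = rows[:n_train]
    cv = rows[n_train:n_train + n_cv]
    test = rows[n_train + n_cv:]
    return train, cv, test

def cost(predict, rows):
    """Squared-error cost of `predict` on (x, y) rows, with a 1/(2m) factor."""
    return sum((predict(x) - y) ** 2 for x, y in rows) / (2 * len(rows))

# (size, price) rows from the dataset slide
data = [(2104, 400), (1600, 330), (2400, 369), (1416, 232), (3000, 540),
        (1985, 300), (1534, 315), (1427, 199), (1380, 212), (1494, 243)]
train, cv, test = split_dataset(data)

# Hypothetical hypothesis with made-up parameters, standing in for a model
# that would normally be fit on `train` only.
def hypothesis(size):
    return 50 + 0.15 * size

j_train = cost(hypothesis, train)
j_cv = cost(hypothesis, cv)
j_test = cost(hypothesis, test)
print(j_train, j_cv, j_test)
```

The same cost function is applied to all three sets; only the data it is evaluated on changes.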
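Model selection
For each candidate degree d = 1, 2, 3, ..., 10, fit the parameters on the training set and compute the cross validation error.
Pick the degree with the lowest cross validation error.
Estimate the generalization error of the picked model on the test set.
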
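The selection procedure can be sketched as follows. Everything concrete here is an assumption for illustration: the synthetic noisy-quadratic data, the interleaved split, the candidate degrees 1 to 4 (kept small so that the plain normal-equation solver stays well conditioned), and the helper names `solve`, `fit_poly`, and `cost`.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_poly(xs, ys, d):
    """Least-squares polynomial of degree d via the normal equations."""
    n = d + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve(ata, aty)

def cost(theta, xs, ys):
    """Squared-error cost with a 1/(2m) factor."""
    preds = [sum(t * x ** k for k, t in enumerate(theta)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / (2 * len(xs))

def unzip(rows):
    return [x for x, _ in rows], [y for _, y in rows]

# Synthetic data (an assumption for illustration): a noisy quadratic.
xs = [i / 10 for i in range(-10, 11)]
ys = [1 + 2 * x * x + 0.05 * math.sin(7 * i) for i, x in enumerate(xs)]
rows = list(zip(xs, ys))
train = [row for i, row in enumerate(rows) if i % 2 == 0]
cv = [row for i, row in enumerate(rows) if i % 4 == 1]
test = [row for i, row in enumerate(rows) if i % 4 == 3]

models, cv_errors = {}, {}
for d in range(1, 5):                      # assumed candidate degrees 1..4
    models[d] = fit_poly(*unzip(train), d)  # fit on the training set only
    cv_errors[d] = cost(models[d], *unzip(cv))

best_d = min(cv_errors, key=cv_errors.get)           # pick by CV error
test_error = cost(models[best_d], *unzip(test))      # report on the test set
print(best_d, cv_errors[best_d], test_error)
```

Because the test set plays no part in choosing the degree, its error is a fairer estimate of generalization error than the cross validation error of the picked model.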
Advice for applying machine learning: Diagnosing bias vs. variance

Bias/variance
[Figure: three fits of price against size]
- High bias (underfit)
- “Just right”
- High variance (overfit)

Bias/variance
[Figure: training error and cross validation error plotted against the degree of polynomial d]

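Diagnosing bias vs. variance
Suppose your learning algorithm is performing less well than you were hoping (the cross validation error or the test error is high). Is it a bias problem or a variance problem?
[Figure: training error and cross validation error plotted against the degree of polynomial d]
Bias (underfit): the training error is high, and the cross validation error is about as high.
Variance (overfit): the training error is low, and the cross validation error is much higher than the training error.
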
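The rule of thumb above can be written as a small helper. This is a sketch only: the function name `diagnose` and the thresholds `acceptable_error` and `gap_ratio` are hypothetical, and sensible values depend entirely on the problem at hand.

```python
def diagnose(train_error, cv_error, acceptable_error, gap_ratio=2.0):
    """Rough bias/variance label from training and cross validation error.

    `acceptable_error` and `gap_ratio` are hypothetical thresholds chosen
    for illustration, not values taken from the slides.
    """
    if train_error > acceptable_error:
        # The model cannot even fit the data it was trained on.
        return "high bias (underfit)"
    if cv_error > acceptable_error and cv_error > gap_ratio * train_error:
        # Fits the training data well but does not carry over to new data.
        return "high variance (overfit)"
    return "no obvious bias or variance problem"

print(diagnose(train_error=10.0, cv_error=11.0, acceptable_error=1.0))
print(diagnose(train_error=0.1, cv_error=5.0, acceptable_error=1.0))
print(diagnose(train_error=0.5, cv_error=0.6, acceptable_error=1.0))
```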
Advice for applying machine learning: Regularization and bias/variance

Linear regression with regularization
[Figure: three fits of price against size, one for each setting of the regularization parameter λ]
- Large λ: high bias (underfit)
- Intermediate λ: “just right”
- Small λ: high variance (overfit)

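Choosing the regularization parameter
Try 12 candidate values of λ. For each candidate, fit the parameters on the training set and compute the cross validation error.
Pick the candidate with the lowest cross validation error, then report its test error.
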
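A minimal sketch of this search, under several assumptions: a single feature, an L2 penalty on the slope only (the intercept is left unpenalized), synthetic data, and a hypothetical grid of 12 candidates made of 0 followed by values that double from 0.01. The grid is illustrative and not recovered from the slide. Note that the cross validation and test errors are computed without the penalty term.

```python
import math

def fit_ridge_line(xs, ys, lam):
    """One-feature linear regression with an L2 penalty on the slope only."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / (sxx + lam)      # lam = 0 gives ordinary least squares
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

def cost(theta, xs, ys):
    """Unregularized squared-error cost with a 1/(2m) factor."""
    theta0, theta1 = theta
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

# Hypothetical candidate grid: 0, then doubling from 0.01 (12 values in total).
lambdas = [0.0] + [0.01 * 2 ** k for k in range(11)]

# Synthetic data (an assumption for illustration): a noisy line.
xs = [i / 10 for i in range(-10, 11)]
ys = [3 * x + 0.5 * math.sin(5 * i) for i, x in enumerate(xs)]
x_train, y_train = xs[0::3], ys[0::3]
x_cv, y_cv = xs[1::3], ys[1::3]
x_test, y_test = xs[2::3], ys[2::3]

cv_errors = {}
for lam in lambdas:
    theta = fit_ridge_line(x_train, y_train, lam)   # fit with the penalty
    cv_errors[lam] = cost(theta, x_cv, y_cv)        # evaluate without it

best_lam = min(cv_errors, key=cv_errors.get)
best_theta = fit_ridge_line(x_train, y_train, best_lam)
test_error = cost(best_theta, x_test, y_test)
print(best_lam, cv_errors[best_lam], test_error)
```

A very large λ drives the slope toward zero (underfitting), while λ = 0 reduces to the unregularized fit.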
Bias/variance as a function of the regularization parameter

Advice for applying machine learning: Learning curves

Learning curves
[Figure: error plotted against training set size]

High bias
[Figure: error plotted against training set size, shown next to fits of price against size]
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.

High variance (and small λ)
[Figure: error plotted against training set size, shown next to fits of price against size]
If a learning algorithm is suffering from high variance, getting more training data is likely to help.

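A learning curve can be computed by refitting the model on growing prefixes of the training set and recording both errors each time. The sketch below assumes a straight-line model fit to synthetic quadratic data, which makes it a high-bias setup; the data, the fixed random seed, and the helper names are all illustrative choices.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = theta0 + theta1 * x (closed form)."""
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / sxx
    return mean_y - theta1 * mean_x, theta1

def cost(theta, rows):
    """Squared-error cost on (x, y) rows, with a 1/(2m) factor."""
    theta0, theta1 = theta
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in rows) / (2 * len(rows))

def learning_curve(train, cv):
    """Return (m, training error, cross validation error) for each prefix size m."""
    curve = []
    for m in range(2, len(train) + 1):
        subset = train[:m]
        theta = fit_line([x for x, _ in subset], [y for _, y in subset])
        # Training error uses only the m examples seen; CV error uses all of cv.
        curve.append((m, cost(theta, subset), cost(theta, cv)))
    return curve

# Synthetic data (an assumption for illustration): y = x^2, so a line underfits.
points = [(i / 10, (i / 10) ** 2) for i in range(30)]
random.Random(0).shuffle(points)
train, cv = points[:20], points[20:]

curve = learning_curve(train, cv)
for m, j_train, j_cv in curve:
    print(f"m={m:2d}  J_train={j_train:.4f}  J_cv={j_cv:.4f}")
```

Plotting the two error columns against m gives the curves described on the slides.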
Advice for applying machine learning: Deciding what to try next (revisited)

Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing the regularization parameter λ
- Try increasing the regularization parameter λ

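Once the problem has been diagnosed, the options above can be narrowed down. The lookup below is a sketch of the usual pairing, which follows from the earlier sections: more data, fewer features, and larger λ reduce variance, while more features, polynomial features, and smaller λ reduce bias. The dictionary and function names are hypothetical.

```python
# Each remedy from the list, paired with the problem it typically addresses.
REMEDIES = {
    "Get more training examples": "high variance",
    "Try smaller sets of features": "high variance",
    "Try getting additional features": "high bias",
    "Try adding polynomial features": "high bias",
    "Try decreasing the regularization parameter": "high bias",
    "Try increasing the regularization parameter": "high variance",
}

def remedies_for(problem):
    """Return the remedies that typically address the given problem."""
    return [remedy for remedy, fixes in REMEDIES.items() if fixes == problem]

for problem in ("high bias", "high variance"):
    print(problem)
    for remedy in remedies_for(problem):
        print("  -", remedy)
```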
Neural networks and overfitting
“Small” neural network: fewer parameters; more prone to underfitting. Computationally cheaper.
“Large” neural network: more parameters; more prone to overfitting. Computationally more expensive.

Use regularization (λ) to address overfitting.
