Advice For Applying Machine Learning: Deciding What To Try Next

The document provides advice on applying machine learning algorithms and diagnosing issues. It discusses: 1) Evaluating a model on a test set to check for overfitting when the model fails to generalize. 2) Using training, validation, and test sets to select models and avoid overfitting during selection. 3) Diagnosing overfitting (high variance) vs underfitting (high bias) issues using different error rates. 4) How regularization can be used to address high variance by reducing model complexity.

Uploaded by

Sujit Kumar

Advice for applying machine learning
Deciding what to try next
Machine Learning
Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to predict housing prices.

However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?
- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing $\lambda$
- Try increasing $\lambda$
Andrew Ng
Machine learning diagnostic:
Diagnostic: A test that you can run to gain insight into what is/isn’t working with a learning algorithm, and gain guidance as to how best to improve its performance.

Diagnostics can take time to implement, but doing so can be a very good use of your time.

Evaluating a hypothesis
Evaluating your hypothesis
[Figure: price vs. size]
Fails to generalize to new examples not in training set.

Features:
- size of house
- no. of bedrooms
- no. of floors
- age of house
- average income in neighborhood
- kitchen size
Evaluating your hypothesis
Dataset:
Size Price
2104 400
1600 330
2400 369
1416 232
3000 540
1985 300
1534 315
1427 199
1380 212
1494 243

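Training/testing procedure for linear regression

- Learn parameter $\theta$ from training data (minimizing training error $J(\theta)$)

- Compute test set error:
  $J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$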
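
The training/testing procedure for linear regression can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: it uses the Size/Price pairs from the dataset slide, assumes a 70%/30% split in the order the rows are listed, and fits a straight line in closed form. The function names `fit_line` and `squared_error` are made up for this example.

```python
# Size/Price pairs from the dataset slide.
data = [(2104, 400), (1600, 330), (2400, 369), (1416, 232), (3000, 540),
        (1985, 300), (1534, 315), (1427, 199), (1380, 212), (1494, 243)]

# Assumption: the first 70% of rows are the training set, the rest the test set.
split = (7 * len(data)) // 10
train, test = data[:split], data[split:]

def fit_line(points):
    """Least-squares fit of h(x) = theta0 + theta1 * x (minimizes training error)."""
    m = len(points)
    mean_x = sum(x for x, _ in points) / m
    mean_y = sum(y for _, y in points) / m
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1

def squared_error(theta, points):
    """J(theta) = 1/(2m) * sum of squared prediction errors over `points`."""
    theta0, theta1 = theta
    m = len(points)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in points) / (2 * m)

theta = fit_line(train)                 # learn theta from the training data only
j_train = squared_error(theta, train)   # training error
j_test = squared_error(theta, test)     # test set error on unseen examples
print(f"J_train = {j_train:.1f}, J_test = {j_test:.1f}")
```

With only ten examples the two numbers are noisy; the point is that the test rows play no part in fitting `theta`.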
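Training/testing procedure for logistic regression
- Learn parameter $\theta$ from training data
- Compute test set error:
  $J_{test}(\theta) = -\frac{1}{m_{test}} \sum_{i=1}^{m_{test}} \left[ y_{test}^{(i)} \log h_\theta(x_{test}^{(i)}) + (1 - y_{test}^{(i)}) \log\left(1 - h_\theta(x_{test}^{(i)})\right) \right]$

- Misclassification error (0/1 misclassification error):
  $err(h_\theta(x), y) = 1$ if $h_\theta(x) \ge 0.5$ and $y = 0$, or if $h_\theta(x) < 0.5$ and $y = 1$; otherwise $err(h_\theta(x), y) = 0$.
  Test error $= \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} err(h_\theta(x_{test}^{(i)}), y_{test}^{(i)})$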
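
The 0/1 misclassification error is simple enough to show directly. The sketch below assumes the usual 0.5 threshold on the logistic output; the parameter vector and the four test examples are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    """h_theta(x) for logistic regression; x includes a leading 1 for the intercept."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def misclassification_error(theta, examples):
    """Fraction of examples where thresholding h_theta(x) at 0.5 disagrees with y."""
    wrong = 0
    for x, y in examples:
        predicted = 1 if predict_proba(theta, x) >= 0.5 else 0
        wrong += int(predicted != y)
    return wrong / len(examples)

# Hypothetical parameters and test set, for illustration only.
theta = [-3.0, 1.0]  # decision boundary at x1 = 3
test_set = [([1.0, 1.0], 0), ([1.0, 2.0], 0), ([1.0, 4.0], 1), ([1.0, 2.5], 1)]

# Only the last example (x1 = 2.5, y = 1) falls on the wrong side of the boundary.
print(misclassification_error(theta, test_set))  # 0.25
```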

Model selection and training/validation/test sets
Overfitting example
[Figure: price vs. size]

Once parameters $\theta$ were fit to some set of data (training set), the error of the parameters as measured on that data (the training error $J(\theta)$) is likely to be lower than the actual generalization error.
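Model selection
Candidate models, one per polynomial degree $d$:
1. $d = 1$: $h_\theta(x) = \theta_0 + \theta_1 x$
2. $d = 2$: $h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$
3. $d = 3$: $h_\theta(x) = \theta_0 + \theta_1 x + \dots + \theta_3 x^3$
...
10. $d = 10$: $h_\theta(x) = \theta_0 + \theta_1 x + \dots + \theta_{10} x^{10}$

Fit each model on the training set to obtain $\theta^{(d)}$, and choose the degree $d$ with the lowest test set error.
How well does the model generalize? Report test set error $J_{test}(\theta^{(d)})$.
Problem: $J_{test}(\theta^{(d)})$ is likely to be an optimistic estimate of generalization error. I.e. our extra parameter ($d$ = degree of polynomial) is fit to test set.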
Evaluating your hypothesis
Dataset:
Size Price
2104 400
1600 330
2400 369
1416 232
3000 540
1985 300
1534 315
1427 199
1380 212
1494 243

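Train/validation/test error
Training error:
$J_{train}(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Cross Validation error:
$J_{cv}(\theta) = \frac{1}{2 m_{cv}} \sum_{i=1}^{m_{cv}} \left( h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)} \right)^2$

Test error:
$J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$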
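Model selection
Candidate models, one per polynomial degree $d$ (1 through 10), each fit on the training set to obtain $\theta^{(d)}$.

Pick the degree $d$ with the lowest cross validation error $J_{cv}(\theta^{(d)})$.
Estimate generalization error on the test set: $J_{test}(\theta^{(d)})$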
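
Model selection with a separate cross validation set might look like the following sketch. It assumes NumPy is available, uses synthetic data (a cubic trend plus noise, not data from the slides), and assumes a 60%/20%/20% split. `half_mse` is a hypothetical helper name for the squared-error cost.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data (assumption, not from the slides): a cubic trend plus noise.
x = rng.uniform(-1.0, 1.0, size=100)
y = 1.0 + 2.0 * x - 3.0 * x**3 + rng.normal(scale=0.1, size=x.size)

# 60% training, 20% cross validation, 20% test (x was drawn in random order).
x_train, y_train = x[:60], y[:60]
x_cv, y_cv = x[60:80], y[60:80]
x_test, y_test = x[80:], y[80:]

def half_mse(coeffs, xs, ys):
    """J = 1/(2m) * sum((h(x) - y)^2) for polynomial coefficients `coeffs`."""
    residuals = np.polyval(coeffs, xs) - ys
    return float(np.mean(residuals**2) / 2.0)

# Fit one model per degree d = 1..10, using the training set only.
fits = {d: np.polyfit(x_train, y_train, deg=d) for d in range(1, 11)}

# Pick the degree with the lowest cross validation error.
cv_errors = {d: half_mse(c, x_cv, y_cv) for d, c in fits.items()}
best_d = min(cv_errors, key=cv_errors.get)

# The test set took no part in fitting or in choosing d, so this estimate
# is not made optimistic by the selection step.
test_error = half_mse(fits[best_d], x_test, y_test)
print(f"chosen d = {best_d}, J_cv = {cv_errors[best_d]:.4f}, J_test = {test_error:.4f}")
```

Choosing `d` on the cross validation set rather than the test set is the whole point of the three-way split: the extra parameter is still fit to some held-out data, but the test set stays untouched.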

Diagnosing bias vs. variance
Bias/variance
[Figure: three plots of price vs. size, left to right:]
- High bias (underfit)
- “Just right”
- High variance (overfit)
Bias/variance
Training error: $J_{train}(\theta)$
Cross validation error: $J_{cv}(\theta)$

[Figure: error plotted against degree of polynomial d]
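Diagnosing bias vs. variance
Suppose your learning algorithm is performing less well than you were hoping. ($J_{cv}(\theta)$ or $J_{test}(\theta)$ is high.) Is it a bias problem or a variance problem?

Bias (underfit):
  $J_{train}(\theta)$ will be high, and $J_{cv}(\theta) \approx J_{train}(\theta)$.

Variance (overfit):
  $J_{train}(\theta)$ will be low, and $J_{cv}(\theta) \gg J_{train}(\theta)$.

[Figure: cross validation error and training error plotted against degree of polynomial d]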
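
As a rough illustration, the bias/variance question can be turned into a small check on the training and cross validation errors. The slides give no numeric thresholds, so `target_error` and `gap_ratio` below are hypothetical knobs introduced for this sketch, and the function name `diagnose` is made up.

```python
def diagnose(j_train, j_cv, target_error, gap_ratio=2.0):
    """Rough bias/variance label from training and cross validation error.

    `target_error` (the error you would accept) and `gap_ratio` (how many times
    larger J_cv must be than J_train to count as "much greater") are
    hypothetical thresholds chosen for illustration.
    """
    if j_train > target_error:
        # The training error itself is high: the model underfits.
        return "high bias"
    if j_cv > target_error and j_cv > gap_ratio * j_train:
        # Training error is low, but cross validation error is much higher.
        return "high variance"
    return "ok"

print(diagnose(j_train=0.90, j_cv=1.00, target_error=0.3))  # high bias
print(diagnose(j_train=0.05, j_cv=0.80, target_error=0.3))  # high variance
print(diagnose(j_train=0.10, j_cv=0.12, target_error=0.3))  # ok
```

A model can suffer from both problems at once; this sketch reports high bias first because a high training error has to be addressed before the gap is meaningful.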

Regularization and bias/variance
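Linear regression with regularization
Model: hypothesis $h_\theta(x)$, fit by minimizing the regularized cost
$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2$
($n$ = number of features; $\theta_0$ is not penalized)

[Figure: three plots of price vs. size, left to right:]
- Large $\lambda$: High bias (underfit)
- Intermediate $\lambda$: “Just right”
- Small $\lambda$: High variance (overfit)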
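Choosing the regularization parameter
Model: regularized linear regression with parameter $\lambda$.

Try a series of candidate values of $\lambda$ (steps 1 through 12). For each candidate, minimize $J(\theta)$ on the training set to obtain $\theta$, then compute the cross validation error $J_{cv}(\theta)$.
Pick the $\lambda$ whose $\theta$ gives the lowest $J_{cv}(\theta)$. Test error: $J_{test}(\theta)$ for that choice.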
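
Choosing $\lambda$ by cross validation can be sketched as follows, assuming NumPy is available. The data is synthetic, the degree-8 polynomial features are an arbitrary choice, and the candidate grid is hypothetical (it is not the list of values from the slide). The fit uses the closed-form normal equation with the intercept left unpenalized; the training, cross validation and test errors are computed without the penalty term.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data (assumption): quadratic trend plus noise, split 60/20/20.
x = rng.uniform(-1.0, 1.0, size=100)
y = 0.5 + x - 2.0 * x**2 + rng.normal(scale=0.2, size=x.size)

def design(xs, degree=8):
    """Feature matrix with columns 1, x, x^2, ..., x^degree."""
    return np.vander(xs, degree + 1, increasing=True)

X_train, y_train = design(x[:60]), y[:60]
X_cv, y_cv = design(x[60:80]), y[60:80]
X_test, y_test = design(x[80:]), y[80:]

def fit_regularized(X, ys, lam):
    """Minimize 1/(2m)*||X theta - y||^2 + lam/(2m)*sum_{j>=1} theta_j^2."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # the intercept theta_0 is not regularized
    return np.linalg.solve(X.T @ X + penalty, X.T @ ys)

def unregularized_error(theta, X, ys):
    """Squared-error cost without the penalty term (J_train, J_cv, J_test)."""
    return float(np.mean((X @ theta - ys) ** 2) / 2.0)

# Hypothetical candidate grid; the slide's own values are not reproduced here.
candidates = [0.0, 0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]

thetas = {lam: fit_regularized(X_train, y_train, lam) for lam in candidates}
cv_errors = {lam: unregularized_error(thetas[lam], X_cv, y_cv) for lam in candidates}

best_lam = min(cv_errors, key=cv_errors.get)
test_error = unregularized_error(thetas[best_lam], X_test, y_test)
print(f"chosen lambda = {best_lam}, J_cv = {cv_errors[best_lam]:.4f}, J_test = {test_error:.4f}")
```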
Bias/variance as a function of the regularization parameter


Learning curves
[Figure: error plotted against m (training set size)]
High bias
[Figure: error plotted against m (training set size), alongside price vs. size plots]

If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.
High variance (and small $\lambda$)
[Figure: error plotted against m (training set size), alongside price vs. size plots]

If a learning algorithm is suffering from high variance, getting more training data is likely to help.
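
Learning curves can be computed by refitting the model on growing prefixes of the training set and recording both errors each time. The sketch below assumes NumPy and uses synthetic quadratic data; a degree-1 fit stands in for the high-bias case and a degree-8 fit for the high-variance case. The function name `learning_curve` and the chosen sizes are illustrative assumptions.

```python
import numpy as np

def learning_curve(x_train, y_train, x_cv, y_cv, degree, sizes):
    """Return [(m, J_train, J_cv)] for models fit on the first m training examples."""
    rows = []
    for m in sizes:
        coeffs = np.polyfit(x_train[:m], y_train[:m], deg=degree)
        train_res = np.polyval(coeffs, x_train[:m]) - y_train[:m]
        cv_res = np.polyval(coeffs, x_cv) - y_cv
        rows.append((m, float(np.mean(train_res**2) / 2), float(np.mean(cv_res**2) / 2)))
    return rows

rng = np.random.default_rng(2)
# Synthetic data (assumption): a quadratic trend plus noise.
x = rng.uniform(-1.0, 1.0, size=300)
y = x**2 + rng.normal(scale=0.1, size=x.size)
x_train, y_train, x_cv, y_cv = x[:200], y[:200], x[200:], y[200:]

sizes = [10, 20, 50, 100, 200]
# Degree 1 cannot follow the quadratic trend (high bias);
# degree 8 has far more freedom than the data needs (high variance at small m).
rows_bias = learning_curve(x_train, y_train, x_cv, y_cv, 1, sizes)
rows_var = learning_curve(x_train, y_train, x_cv, y_cv, 8, sizes)

for label, rows in (("degree 1", rows_bias), ("degree 8", rows_var)):
    for m, j_train, j_cv in rows:
        print(f"{label}, m={m:3d}: J_train={j_train:.4f}  J_cv={j_cv:.4f}")
```

In the degree-1 rows both errors should level off at a similar, high value as `m` grows, which is the signature of high bias. In the degree-8 rows the training error starts far below the cross validation error at small `m`, which is the gap associated with high variance.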

Deciding what to try next (revisited)
Debugging a learning algorithm:
Suppose you have implemented regularized linear regression to predict housing prices. However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?

- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing $\lambda$
- Try increasing $\lambda$
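
One way to read the list in light of the earlier diagnostics is as a lookup from diagnosis to candidate remedies. The grouping below is an interpretation, not text from the slide: it follows from the learning-curve statements (more data helps high variance, not high bias) and from the regularization slide (large $\lambda$ pushes toward high bias, small $\lambda$ toward high variance). The names `REMEDIES` and `what_to_try_next` are made up for this sketch.

```python
# Interpretation (assumption): which diagnosis each option is usually aimed at.
REMEDIES = {
    "high variance": [
        "Get more training examples",
        "Try smaller sets of features",
        "Try increasing lambda",
    ],
    "high bias": [
        "Try getting additional features",
        "Try adding polynomial features",
        "Try decreasing lambda",
    ],
}

def what_to_try_next(diagnosis):
    """Return the options that target the given diagnosis (empty list if none apply)."""
    return REMEDIES.get(diagnosis, [])

for option in what_to_try_next("high variance"):
    print(option)
```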
Neural networks and overfitting
“Small” neural network (fewer parameters; more prone to underfitting)
Computationally cheaper.

“Large” neural network (more parameters; more prone to overfitting)
Computationally more expensive.
Use regularization ($\lambda$) to address overfitting.
