ML 02 Linear Regression
Machine Learning
Linear Regression
h: hypothesis function
How to represent the hypothesis h?
hθ(x) = θ0 + θ1x
The θi are the parameters:
- θ0 is the intercept (the value of h at x = 0)
- θ1 is the slope (gradient)
θ: vector of all the parameters
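As a minimal sketch (not from the slides; the parameter values are hypothetical), the hypothesis is one line of Python:

def h(theta0, theta1, x):
    # Univariate hypothesis: predicted value for input x
    return theta0 + theta1 * x

print(h(50.0, 0.1, 2104))  # hypothetical θ0 = 50, θ1 = 0.1 -> 260.4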
Parameters: θ0, θ1
Cost function: J(θ0, θ1) = (1/(2m)) · Σ_{i=1..m} (hθ(x^(i)) − y^(i))²
Goal: minimize J(θ0, θ1) over θ0, θ1
(For fixed θ0, θ1, hθ(x) is a function of x; J(θ0, θ1) is a function of the parameters θ0, θ1.)
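A minimal Python sketch of this cost function (NumPy assumed; the θ values are hypothetical, and x and y are the first rows of the housing table later in this section):

import numpy as np

def cost(theta0, theta1, x, y):
    # J(θ0, θ1) = (1/(2m)) · Σ (h(x^(i)) − y^(i))²
    m = len(x)
    err = (theta0 + theta1 * x) - y
    return np.sum(err ** 2) / (2 * m)

x = np.array([2104.0, 1416.0, 1534.0, 852.0])  # size (feet²)
y = np.array([460.0, 232.0, 315.0, 178.0])     # price ($1000s)
print(cost(50.0, 0.1, x, y))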
[Figure: training-data scatter plot, Price ($) in 1000's vs. Size in feet² (x)]
Want: minimize J(θ0, θ1) over θ0, θ1
• Outline:
• Start with some initial θ0, θ1
• Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum
[Figure: surface plot of the cost over the (θ0, θ1) plane]
If the cost function has multiple local minima, the starting point can determine which minimum gradient descent reaches.
[Figure: surface plot of J(θ0, θ1) over θ0, θ1, showing multiple local minima]
Gradient descent algorithm:
repeat until convergence {
  θj := θj − α · ∂J(θ0, θ1)/∂θj   (for j = 0 and j = 1)
}
Update θ0 and θ1 simultaneously: compute both new values from the old ones before overwriting either.
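A runnable sketch of this loop (alpha and the iteration count are hypothetical choices; x and y as in the cost example). Both gradients are computed from the old parameters before either is updated, which is exactly the simultaneous update:

import numpy as np

def gradient_descent(x, y, alpha=1e-7, iters=1000):
    # Batch gradient descent for h(x) = theta0 + theta1 * x
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        err = (theta0 + theta1 * x) - y   # h(x^(i)) − y^(i) for all i
        grad0 = err.sum() / m             # ∂J/∂θ0
        grad1 = (err * x).sum() / m       # ∂J/∂θ1
        theta0 -= alpha * grad0           # both updates use the old θ values
        theta1 -= alpha * grad1
    return theta0, theta1

The very small alpha is needed because the sizes are on the order of thousands; feature scaling (later in this section) is what lets one use a larger learning rate.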
“Batch” gradient descent: each step of gradient descent uses all m training examples.
Multiple features (variables).
Size (feet²)   Bedrooms   Floors   Age (years)   Price ($1000s)
2104           5          1        45            460
1416           3          2        40            232
1534           3          2        30            315
 852           2          1        36            178
 …             …          …        …             …
Notation:
n = number of features
m = number of training examples
x^(i) = input (features) of the i-th training example
xj^(i) = value of feature j in the i-th training example
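To make the indexing concrete, here is the table above as a NumPy array (a hypothetical layout: one row per training example, one column per feature):

import numpy as np

X = np.array([[2104, 5, 1, 45],
              [1416, 3, 2, 40],
              [1534, 3, 2, 30],
              [ 852, 2, 1, 36]])
m, n = X.shape    # m = 4 training examples, n = 4 features
x_2 = X[1]        # x^(2): features of the 2nd example (row index 1)
v = X[1, 2]       # x3^(2): value of feature 3 in example 2 -> 2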
Hypothesis:
For univariate linear regression: hθ(x) = θ0 + θ1x
With multiple features: hθ(x) = θ0 + θ1x1 + θ2x2 + … + θnxn
(Defining x0 = 1, this can be written compactly as hθ(x) = θᵀx.)
Parameters: θ = (θ0, θ1, …, θn)
Cost function: J(θ) = (1/(2m)) · Σ_{i=1..m} (hθ(x^(i)) − y^(i))²
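A vectorized sketch of the multivariate hypothesis and cost, assuming the design matrix X carries a leading column of ones for x0:

import numpy as np

def predict(theta, X):
    # hθ(x) = θᵀx, evaluated for every row of X at once
    return X @ theta

def cost(theta, X, y):
    # J(θ) = (1/(2m)) · Σ (hθ(x^(i)) − y^(i))²
    m = len(y)
    err = X @ theta - y
    return err @ err / (2 * m)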
Gradient descent:
Repeat {
  θj := θj − α · ∂J(θ)/∂θj
} (simultaneously update θj for every j = 0, …, n)
Gradient Descent
Previously (n = 1):
Repeat {
  θ0 := θ0 − α · (1/m) Σ_{i=1..m} (hθ(x^(i)) − y^(i))
  θ1 := θ1 − α · (1/m) Σ_{i=1..m} (hθ(x^(i)) − y^(i)) · x^(i)
} (simultaneously update θ0, θ1)
New algorithm (n ≥ 1):
Repeat {
  θj := θj − α · (1/m) Σ_{i=1..m} (hθ(x^(i)) − y^(i)) · xj^(i)
} (simultaneously update θj for j = 0, …, n)
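The same update in vectorized form, as a sketch (X includes the x0 = 1 column, so θ0 needs no special case; alpha is a hypothetical choice):

import numpy as np

def gradient_descent(X, y, alpha=0.01, iters=1000):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        err = X @ theta - y                      # hθ(x^(i)) − y^(i), all i
        theta = theta - alpha * (X.T @ err) / m  # every θj updated at once
    return theta

Writing the update as a single vector assignment makes the simultaneous update automatic.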
Practical aspects of applying gradient descent
Feature Scaling
Idea: Make sure features are on a similar scale.
E.g. x1 = size (feet²), x2 = number of bedrooms
Mean normalization:
Replace xj with xj − μj to make features have approximately zero mean (do not apply to x0 = 1).
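A sketch of mean normalization, here combined with scaling by the feature range (dividing by the standard deviation is an equally common choice); the x0 column is excluded:

import numpy as np

def mean_normalize(X):
    # Replace each xj with (xj − μj) / rangej, column by column
    mu = X.mean(axis=0)
    rng = X.max(axis=0) - X.min(axis=0)
    return (X - mu) / rng

X = np.array([[2104., 5.], [1416., 3.], [1534., 3.], [852., 2.]])
print(mean_normalize(X))  # each column now has mean ≈ 0, values roughly in [-0.5, 0.5]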
[Figure: Price (y) vs. Size (x)]