Linear Regression
Regression
[Diagram: an input $\mathbf{x}$ is mapped by the hypothesis $h(\mathbf{x})$ to a real-valued output $y \in \mathbb{R}$.]
2. Linear Regression - Examples
$$h_{\mathbf{w}}(\mathbf{x}) = w_0 + w_1 x_1 + w_2 x_2 + \cdots + w_d x_d = \sum_{j=0}^{d} w_j x_j = \mathbf{w}^T \mathbf{x}, \quad (x_0 = 1)$$
Where:
o $x_1, \ldots, x_d$: features of $\mathbf{x}$
o $w_0$: bias
o $w_1, \ldots, w_d$: weight of each feature $x_j$
o $h_{\mathbf{w}}(\mathbf{x})$: predicted value of $\mathbf{x}$ ($\hat{y}$)
[Diagram: the features $x_1, \ldots, x_d$ are multiplied by the weights $w_1, \ldots, w_d$, summed together with the bias $w_0$, and the sum is the output $h_{\mathbf{w}}(\mathbf{x})$.]
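A minimal sketch (not from the slides) of how this hypothesis can be evaluated as a dot product once the constant feature $x_0 = 1$ is prepended; the weight and feature values below are invented for illustration:

```python
import numpy as np

def predict(w, x):
    """Evaluate h_w(x) = w^T x, where x already contains the constant x0 = 1."""
    return np.dot(w, x)

# Invented numbers: d = 3 features plus the prepended x0 = 1.
w = np.array([0.5, 2.0, -1.0, 3.0])   # [w0, w1, w2, w3]
x = np.array([1.0, 1.5, 2.0, 0.5])    # [x0 = 1, x1, x2, x3]
print(predict(w, x))                  # 0.5 + 2*1.5 - 1*2.0 + 3*0.5 = 3.0
```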
2. Linear Regression - Hypothesis
The parameters are learned by minimizing a cost function $J(\mathbf{w})$ over the training data:
$$\hat{\mathbf{w}} = \underset{\mathbf{w}}{\arg\min}\; J(\mathbf{w})$$
3. Simple Linear Regression
● Simple linear regression deals with one dependent variable and one independent
variable.
● It uses a line equation, learned during training, that approximates the relationship
between these two variables and is used to estimate the predicted value.
$$h_{\mathbf{w}}(\mathbf{x}) = w_0 + w_1 x_1$$
Where:
o $x_1$: the single feature of $\mathbf{x}$
o $w_0$: bias (intercept)
o $w_1$: weight of $x_1$ (slope)
o $h_{\mathbf{w}}(\mathbf{x})$: predicted value of $\mathbf{x}$
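For instance, with illustrative (invented) parameters $w_0 = 1$ and $w_1 = 2$, an observation with $x_1 = 3$ is predicted as
$$h_{\mathbf{w}}(\mathbf{x}) = 1 + 2 \cdot 3 = 7.$$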
3. Simple Linear Regression - Examples
● In simple linear regression, the hypothesis $h_{\mathbf{w}}(\mathbf{x})$ is a line. The prediction is done
by projecting the observation onto that line.
$$J(\mathbf{w}) = \frac{1}{2n} \sum_{i=1}^{n} \left(h_{\mathbf{w}}(\mathbf{x}_i) - y_i\right)^2$$
● The goal is to find the parameters $\hat{\mathbf{w}} = (w_0, w_1)$ that minimize the cost function $J(\mathbf{w})$ as
much as possible:
$$\hat{\mathbf{w}} = \underset{\mathbf{w}}{\arg\min}\; J(\mathbf{w})$$
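A small sketch of this cost computation, assuming the data are stored in NumPy arrays (the sample values are invented for illustration):

```python
import numpy as np

def cost(w0, w1, x, y):
    """J(w) = 1/(2n) * sum_i (h_w(x_i) - y_i)^2 for simple linear regression."""
    n = len(x)
    predictions = w0 + w1 * x
    return np.sum((predictions - y) ** 2) / (2 * n)

# Invented example data.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(cost(0.0, 2.0, x, y))   # 0.0: the line y = 2x fits these points exactly
print(cost(0.0, 1.0, x, y))   # (1 + 4 + 9) / (2*3) ≈ 2.33
```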
3. Simple Linear Regression
The parameters are updated iteratively by gradient descent:
$$w_i = w_i - \eta \frac{\partial}{\partial w_i} J(\mathbf{w})$$
Where $\eta$ is the learning rate (step size).
[Figure: contour plot of the cost $J(\mathbf{w})$ over the parameters $w_0$ and $w_1$.]
For the cost
$$J(\mathbf{w}) = \frac{1}{2n} \sum_{i=1}^{n} \left(h_{\mathbf{w}}(\mathbf{x}_i) - y_i\right)^2, \qquad h_{\mathbf{w}}(\mathbf{x}) = w_0 + w_1 x_1,$$
the partial derivatives are
$$\frac{\partial}{\partial w_0} J(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} \left(h_{\mathbf{w}}(\mathbf{x}_i) - y_i\right)$$
$$\frac{\partial}{\partial w_1} J(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} \left(h_{\mathbf{w}}(\mathbf{x}_i) - y_i\right) x_i$$
$$\Longrightarrow \frac{\partial}{\partial w_j} J(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} \left(h_{\mathbf{w}}(\mathbf{x}_i) - y_i\right) x_{i,j}, \quad (x_{i,0} = 1)$$
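A sketch of these two partial derivatives in code, assuming the data are NumPy arrays:

```python
import numpy as np

def gradients(w0, w1, x, y):
    """Return (dJ/dw0, dJ/dw1) for J(w) = 1/(2n) * sum_i (w0 + w1*x_i - y_i)^2."""
    n = len(x)
    errors = (w0 + w1 * x) - y          # h_w(x_i) - y_i for every example
    grad_w0 = np.sum(errors) / n        # multiplier is x_0 = 1
    grad_w1 = np.sum(errors * x) / n    # multiplier is the feature x_i
    return grad_w0, grad_w1
```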
4. Gradient Descent - Simple Linear Regression
● The batch gradient descent algorithm is outlined as follows (a sketch is given below), using
the notation $m = w_1$ (slope) and $b = w_0$ (intercept).
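The algorithm itself appears as a figure on the slides; the following is a minimal sketch of batch gradient descent for simple linear regression, reusing the gradient formulas above. The learning rate, iteration count, and toy data are assumptions for illustration:

```python
import numpy as np

def batch_gradient_descent(x, y, eta=0.05, n_iters=5000):
    """Fit h(x) = b + m*x by repeatedly stepping against the gradient of J(w)."""
    n = len(x)
    b, m = 0.0, 0.0                        # w0 and w1 initialized to zero
    for _ in range(n_iters):
        errors = (b + m * x) - y           # h_w(x_i) - y_i for all examples
        b -= eta * np.sum(errors) / n      # b = b - eta * dJ/dw0
        m -= eta * np.sum(errors * x) / n  # m = m - eta * dJ/dw1
    return b, m

# Toy data lying roughly on the line y = 1 + 2x (invented for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.0, 8.9])
b, m = batch_gradient_descent(x, y)
print(b, m)   # should approach an intercept near 1 and a slope near 2
```

With a learning rate that is too large the updates overshoot and $J(\mathbf{w})$ can diverge; with one that is too small convergence is very slow, which is the effect illustrated on the next slide.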
4. Gradient Descent - Effect of Learning Rate
5. Least Squares Method
● The least squares method is another way to learn the optimal parameters $\hat{\mathbf{w}}$ of linear
regression directly. It is applicable because the cost function is convex and has a single
minimum.
● Given $n$ examples $\{(\mathbf{x}_1, y_1), (\mathbf{x}_2, y_2), \ldots, (\mathbf{x}_n, y_n)\}$ such that $x \in \mathbb{R}$, the regression line of
simple linear regression is calculated as follows:
$$h_{\mathbf{w}}(\mathbf{x}) = w_0 + w_1 x_1$$
Where
$$w_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \qquad w_0 = \bar{y} - w_1 \bar{x}$$
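A minimal sketch of these two closed-form formulas, assuming NumPy arrays for the data:

```python
import numpy as np

def least_squares_fit(x, y):
    """Return (w0, w1) for the regression line h(x) = w0 + w1*x."""
    x_bar, y_bar = np.mean(x), np.mean(y)
    w1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    w0 = y_bar - w1 * x_bar
    return w0, w1
```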
5. Least Squares Method
Example: Sam is the owner of an ice-cream shop. He wants to increase his income by preparing
ice-creams according to the duration of sunshine. Using the data he collected, estimate
how many ice-creams Sam should prepare when the weather forecast says "we expect 8
hours of sun tomorrow".

Sunshine hours   Ice-creams sold
2   4
3   5
5   7
7   10
9   15
5. Least Squares Method
Denote:
𝑥: number of sunshine hours
y: number of ice-creams sold
5. Least Squares Method
Step 1: Calculate the means $\bar{x}$ and $\bar{y}$ over all pairs $(x, y)$.

𝒙   𝒚
2   4
3   5
5   7
7   10
9   15

$\bar{x} = 5.2$   $\bar{y} = 8.2$
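For this data the means work out as
$$\bar{x} = \frac{2+3+5+7+9}{5} = 5.2, \qquad \bar{y} = \frac{4+5+7+10+15}{5} = 8.2.$$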
5. Least Squares Method
Step 2: For each pair, calculate $x - \bar{x}$, $(x - \bar{x})^2$, $y - \bar{y}$, and $(x - \bar{x})(y - \bar{y})$.

𝒙   𝒚   x − x̄   (x − x̄)²   y − ȳ   (x − x̄)(y − ȳ)
2   4   −3.2   10.24   −4.2   13.44
3   5   −2.2   4.84   −3.2   7.04
5   7   −0.2   0.04   −1.2   0.24
7   10   1.8   3.24   1.8   3.24
9   15   3.8   14.44   6.8   25.84

$\bar{x} = 5.2$   $\bar{y} = 8.2$
5. Least Squares Method
Step 3: Calculate $\sum (x - \bar{x})(y - \bar{y})$ and $\sum (x - \bar{x})^2$.

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = 32.8, \qquad \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = 49.8$$
5. Least Squares Method
Step 4: Calculate the parameters $w_0$ and $w_1$ to obtain the regression line.
$$w_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{49.8}{32.8} \approx 1.51$$
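As a quick numerical check of steps 1–4 (and of the prediction the example asks for), the same computation in NumPy; note that the exact ratio is $49.8 / 32.8 \approx 1.518$, which the slide rounds to 1.51:

```python
import numpy as np

# Data from the ice-cream example: sunshine hours vs. ice-creams sold.
x = np.array([2.0, 3.0, 5.0, 7.0, 9.0])
y = np.array([4.0, 5.0, 7.0, 10.0, 15.0])

x_bar, y_bar = x.mean(), y.mean()   # 5.2 and 8.2
w1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)   # 49.8 / 32.8
w0 = y_bar - w1 * x_bar
print(w1, w0)        # ≈ 1.518 and ≈ 0.305
print(w0 + w1 * 8)   # fitted estimate for 8 hours of sun, ≈ 12.45 ice-creams
```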