Gradient Descent Algorithm
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic data: y = 4 + 3x + Gaussian noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
# Augment X with a column of ones for the intercept term
X_b = np.c_[np.ones((100, 1)), X]

# Initialize parameters
learning_rate = 0.1
n_iterations = 1000
m = len(X_b)
theta = np.random.randn(2, 1)

def compute_cost(X_b, y, theta):
    predictions = X_b.dot(theta)
    cost = (1 / (2 * m)) * np.sum((predictions - y) ** 2)
    return cost

# Gradient Descent
cost_history = []
for iteration in range(n_iterations):
    gradients = (1 / m) * X_b.T.dot(X_b.dot(theta) - y)
    theta = theta - learning_rate * gradients
    cost_history.append(compute_cost(X_b, y, theta))

# Plot the data and the fitted line
plt.scatter(X, y, label="Data")
plt.plot(X, X_b.dot(theta), "r-", label="Prediction")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()

# Plot the cost over the iterations
plt.figure()
plt.plot(cost_history)
plt.xlabel('Iterations')
plt.ylabel('Cost')
plt.show()

print(f"Intercept: {theta[0][0]}")
print(f"Slope: {theta[1][0]}")
Let's break down why \( X_b \cdot \theta \) is part of the gradient calculation in linear regression.
The linear regression model predicts the output \( \hat{y} \) using the equation:
\[ \hat{y} = \theta_0 + \theta_1 x \]
In vectorized form:
\[ \hat{\mathbf{y}} = X_b \cdot \theta \]
where:
\( \hat{\mathbf{y}} \) is the vector of predicted values, \( X_b \) is the augmented feature matrix (including a column of ones for the bias (intercept) term), and \( \theta \) is the vector of parameters (including the intercept and slope).
Cost Function:
The cost function (Mean Squared Error) for linear regression is:
\[ J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \]
where:
\( m \) is the number of training examples, \( h_\theta(x^{(i)}) \) is the prediction for the \( i \)-th example, and \( y^{(i)} \) is its actual value.
To minimize the cost function, we need to compute its gradient with respect to the parameters \( \theta \). The gradient is the vector of partial derivatives of the cost function with respect to each parameter.
The gradient of the cost function is:
\[ \nabla_\theta J(\theta) = \frac{1}{m} X_b^T (X_b \cdot \theta - y) \]
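One way to convince yourself this formula is right is to compare it against a numerical finite-difference gradient. This is only a sanity-check sketch; the synthetic data, seed, and epsilon are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X_b = np.c_[np.ones((20, 1)), 2 * rng.random((20, 1))]
y = 4 + 3 * X_b[:, 1:2] + rng.standard_normal((20, 1))
theta = rng.standard_normal((2, 1))
m = len(X_b)

def cost(t):
    return (1 / (2 * m)) * np.sum((X_b.dot(t) - y) ** 2)

# Analytic gradient from the formula above
grad = (1 / m) * X_b.T.dot(X_b.dot(theta) - y)

# Numerical gradient via central differences
eps = 1e-6
num_grad = np.zeros_like(theta)
for j in range(len(theta)):
    t_plus, t_minus = theta.copy(), theta.copy()
    t_plus[j] += eps
    t_minus[j] -= eps
    num_grad[j] = (cost(t_plus) - cost(t_minus)) / (2 * eps)

print(np.max(np.abs(grad - num_grad)))  # essentially zero
```

Because the cost is quadratic in \( \theta \), the central difference is exact up to floating-point rounding, so the two gradients agree very closely.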
Breaking Down the Gradient Calculation:
Predicted Values: \( X_b \cdot \theta \) gives the predicted values for all training examples. This is the hypothesis function \( h_\theta(x) \).
Error Term: \( X_b \cdot \theta - y \) gives the difference between the predicted values and the actual values (the error term).
Gradient Calculation: \( X_b^T (X_b \cdot \theta - y) \) computes the dot product of the transpose of \( X_b \) and the error term. This gives the sum of the gradients for all training examples.
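That "sum of the gradients for all training examples" can be checked directly: the single matrix product equals an explicit loop that accumulates each example's contribution. The values below are arbitrary synthetic data for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
X_b = np.c_[np.ones((5, 1)), rng.random((5, 1))]
y = rng.random((5, 1))
theta = rng.standard_normal((2, 1))

error = X_b.dot(theta) - y            # per-example errors, shape (5, 1)

# Vectorized: one matrix product sums over all examples
vec = X_b.T.dot(error)

# Loop: accumulate each example's contribution error_i * x_i
acc = np.zeros((2, 1))
for i in range(len(X_b)):
    acc += error[i, 0] * X_b[i].reshape(2, 1)

print(np.allclose(vec, acc))  # True
```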
Summary:
See the code: X_b.dot(theta) computes \( mx + c \) for every training example. \( X_b \) is the matrix whose first column is all ones (for the intercept \( c \)) and whose second column holds the x values, while theta holds the intercept and slope. At every iteration step we update theta to a new value.
This line calculates the gradient of the cost function with respect to the parameters \( \theta \). The gradients indicate how much the cost function would change if we adjusted the parameters slightly. By updating the parameters in the opposite direction of the gradients, we minimize the cost function.
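A minimal sketch of a single update step, showing that moving opposite the gradient lowers the cost; the learning rate and data here are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)
X_b = np.c_[np.ones((50, 1)), 2 * rng.random((50, 1))]
y = 4 + 3 * X_b[:, 1:2] + rng.standard_normal((50, 1))
m = len(X_b)

def cost(t):
    return (1 / (2 * m)) * np.sum((X_b.dot(t) - y) ** 2)

theta = np.zeros((2, 1))
before = cost(theta)

# One step in the direction opposite the gradient
grad = (1 / m) * X_b.T.dot(X_b.dot(theta) - y)
theta = theta - 0.1 * grad
after = cost(theta)

print(after < before)  # True for this small step
```

With too large a learning rate the step can overshoot and increase the cost; 0.1 is safely small for this data.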
Summary:
• The error term \( X_b \cdot \theta - y \) is used to calculate how far off the predictions are from the actual values.
• The gradient calculation uses this error term to determine the direction and magnitude of the parameter updates needed to minimize the cost function.
Derivation
1. Cost Function: The cost function (Mean Squared Error) for linear regression is:
\[ J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \]
2. Substitute the Hypothesis Function: with \( h_\theta(x^{(i)}) = X_b^{(i)} \cdot \theta \), the cost becomes:
\[ J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( X_b^{(i)} \cdot \theta - y^{(i)} \right)^2 \]
3. Differentiate with Respect to \( \theta_j \): applying the chain rule to the squared term gives:
\[ \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( X_b^{(i)} \cdot \theta - y^{(i)} \right) \cdot \frac{\partial}{\partial \theta_j} \left( X_b^{(i)} \cdot \theta - y^{(i)} \right) \]
4. Differentiate the Inner Term: the derivative of \( X_b^{(i)} \cdot \theta - y^{(i)} \) with respect to \( \theta_j \) is \( X_{b,j}^{(i)} \), the \( j \)-th feature of the \( i \)-th example.
5. Simplify the Derivative:
\[ \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( X_b^{(i)} \cdot \theta - y^{(i)} \right) \cdot X_{b,j}^{(i)} \]
6. Vectorize the Gradient: we can write the gradient for all parameters \( \theta \) in vectorized form:
\[ \nabla_\theta J(\theta) = \frac{1}{m} X_b^T (X_b \cdot \theta - y) \]
Thus the gradient of the cost function with respect to the parameters \( \theta \) is \( \frac{1}{m} X_b^T (X_b \cdot \theta - y) \), matching the expression used in the code.
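As a final sanity check on the derived update rule, gradient descent built from this gradient should converge to the ordinary least-squares solution, where the gradient vanishes. This sketch uses synthetic data; the iteration count and learning rate are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
X_b = np.c_[np.ones((200, 1)), 2 * rng.random((200, 1))]
y = 4 + 3 * X_b[:, 1:2] + rng.standard_normal((200, 1))
m = len(X_b)

# Run gradient descent with the derived update rule
theta = np.zeros((2, 1))
for _ in range(5000):
    theta -= 0.1 * (1 / m) * X_b.T.dot(X_b.dot(theta) - y)

# The closed-form least-squares solution, where the gradient is zero
theta_exact = np.linalg.lstsq(X_b, y, rcond=None)[0]

print(np.allclose(theta, theta_exact, atol=1e-4))  # True
```

Both approaches recover parameters close to the true intercept 4 and slope 3, up to the noise in the data.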