Regression
Regression is a predictive modelling technique that investigates the relationship between a dependent variable (target) and one or more independent variable(s) (predictor). It is used to predict a continuous output (the dependent variable) from the input features.
Let’s say you want to estimate the growth in sales of a company based on current economic conditions. Recent company data indicates that sales growth is around two and a half times the growth in the economy. For example, if the economy is expected to grow by 2%, the predicted sales growth would be roughly 2.5 × 2% = 5%. Using this insight, we can predict the company’s future sales based on current and past information.
3.1.1 Linear Regression
Linear Regression establishes a relationship between dependent variable
(Y) and one or more independent variables (X) using a best fit straight
line (also known as regression line).
The equation for simple linear regression is: y = mx + b, where:
Y is the dependent variable
X is the independent variable
m is the slope of the line
b is the intercept
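As a quick illustration, here is a minimal sketch (not from the original article) that fits a simple linear regression with NumPy; the data values are made up for demonstration.

```python
import numpy as np

# Hypothetical data: X (independent variable) and Y (dependent variable)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 4.3, 6.2, 8.0, 9.9])

# np.polyfit with degree 1 returns the least-squares slope (m) and intercept (b)
m, b = np.polyfit(X, Y, 1)
print(f"y = {m:.3f}x + {b:.3f}")

# Use the fitted line to predict Y for a new X value
x_new = 6.0
print("prediction:", m * x_new + b)
```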
3.1.2 Multiple Linear Regression
This involves more than one independent variable and one dependent
variable. The equation for multiple linear regression is:
y = β0 + β1X1 + β2X2 + … + βnXn
where:
Y is the dependent variable
X1, X2, …, Xn are the independent variables
β0 is the intercept
β1, β2, …, βn are the slopes
3.1.2.1 The goal of the algorithm is to find the best-fit line equation that can predict values based on the independent variables.
In regression, a set of records with X and Y values is available, and these values are used to learn a function; if you then want to predict Y for an unseen X, this learned function can be used. Since in regression we have to find the value of Y, we need a function that predicts a continuous Y given X as the independent features.
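As a rough sketch of multiple linear regression, the example below assumes scikit-learn's LinearRegression and uses made-up records; the article itself does not prescribe a particular library.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical records: each row has two independent variables (X1, X2)
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 5.0]])
# Dependent variable, generated roughly as y = 1 + 2*X1 + 0.5*X2 with small noise
Y = np.array([4.1, 5.4, 9.0, 10.6, 13.4])

model = LinearRegression()
model.fit(X, Y)  # learns the intercept (β0) and slopes (β1, β2)

print("intercept (β0):", model.intercept_)
print("slopes (β1, β2):", model.coef_)
print("prediction for X1=6, X2=2:", model.predict([[6.0, 2.0]]))
```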
4 What is the best Fit Line?
Our primary objective in linear regression is to locate the best-fit line, which means the error between the predicted and actual values should be kept to a minimum; the best-fit line is the one with the least error. The best-fit line equation describes a straight line that represents the relationship between the dependent and independent variables. The slope of the line indicates how much the dependent variable changes for a unit change in the independent variable(s).
Figure: Linear Regression types (source: https://www.analyticsvidhya.com/blog/2021/06/linear-regression-in-machine-learning/)
The model gets the best regression fit line by finding the best θ1 and θ2 values.
θ1: intercept
θ2: coefficient of x
Once we find the best θ1 and θ2 values, we get the best-fit line. So when we finally use our model for prediction, it will predict the value of y for the input value of x.
4.1.1 How to update θ1 and θ2 values to get the best-fit line?
To achieve the best-fit regression line, the model aims to predict the target value Ŷ such that the error between the predicted value Ŷ and the true value Y is minimum. So, it is very important to update the θ1 and θ2 values to reach the best values that minimize the error between the predicted y value (pred) and the true y value (y).
minimize (1/n) ∑ᵢ₌₁ⁿ (ŷᵢ − yᵢ)²
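One common way to perform this update is gradient descent on the error above. The sketch below is an illustrative implementation with an arbitrary learning rate and made-up data, not code from the article.

```python
import numpy as np

# Hypothetical training data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.1, 6.9, 9.2, 11.0])

theta1, theta2 = 0.0, 0.0   # intercept and coefficient of x, initialized to zero
lr = 0.01                   # learning rate (arbitrary choice)

for _ in range(5000):
    y_pred = theta1 + theta2 * x            # current predictions (ŷ)
    error = y_pred - y                      # (ŷ_i − y_i)
    # Gradients of (1/n) Σ (ŷ_i − y_i)² with respect to θ1 and θ2
    grad_theta1 = (2.0 / len(x)) * error.sum()
    grad_theta2 = (2.0 / len(x)) * (error * x).sum()
    # Step both parameters against their gradients
    theta1 -= lr * grad_theta1
    theta2 -= lr * grad_theta2

print(f"θ1 (intercept) ≈ {theta1:.3f}, θ2 (slope) ≈ {theta2:.3f}")
```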
4.2 COST FUNCTION FOR LINEAR REGRESSION
The cost function (or loss function) is nothing but the error or difference between the predicted value Ŷ and the true value Y.
In Linear Regression, the Mean Squared Error (MSE) cost function is employed, which calculates the average of the squared errors between the predicted values ŷᵢ and the actual values yᵢ. The purpose is to determine the optimal values for the intercept θ1 and the coefficient of the input feature θ2 that provide the best-fit line for the given data points. The linear equation expressing this relationship is ŷᵢ = θ1 + θ2xᵢ.
The MSE cost function can be calculated as:
Cost function: J = (1/n) ∑ᵢ₌₁ⁿ (ŷᵢ − yᵢ)²
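For concreteness, here is a tiny sketch of this MSE cost computed with NumPy; the prediction and target values are placeholders.

```python
import numpy as np

def mse_cost(y_pred, y_true):
    # Mean squared error: J = (1/n) Σ (ŷ_i − y_i)²
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    return np.mean((y_pred - y_true) ** 2)

# Toy example with made-up predictions and true values
print(mse_cost([2.5, 4.0, 6.1], [3.0, 4.0, 6.0]))   # ≈ 0.0867
```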