10.Introduction to Artificial Intelligence

The document provides an overview of linear regression, a statistical method used to determine the relationship between input and output variables, where output values are continuous. It explains concepts such as simple and multivariate linear regression, error measurement, and the gradient descent optimization algorithm for improving prediction accuracy. Additionally, it highlights the importance of finding the best-fit line to minimize errors in predictions.


Introduction to Artificial Intelligence

Supervised Learning - Linear Regression


Regression
Linear Regression
Linear Regression and Deterministic
Relationship
Linear Regression Function
Error Measurement
Simple Linear Regression
Multivariate Linear Regression
Regression

• Regression is a statistical method that aims to determine the relationship between the input and output variables, where the output values are continuous numbers.
• In regression, the output values are continuous values within a specified range.

[Figure: scatter plot of house price against house space (square feet).]
Regression

• In regression, it is assumed that the output variables depend on the input variables.
• The input variables are called independent variables, also known as explanatory variables.
• The output variables are called dependent variables, also known as response or outcome variables.

[Figure: scatter plot of house price against house space (square feet).]
Regression

• The relationship between inputs and outputs can be represented by linear or polynomial regression. Depending on the problem, we use different forms of regression to represent the relationship.
• Examples of using regression:
1. Forecasting or predicting outcome values.
2. Predicting prices, economics, etc.
3. Analyzing the correlation between variables.

[Figure: example fits showing a linear relationship and a polynomial relationship.]
Linear Regression

• Linear regression is a machine learning algorithm that aims to find a linear relationship between the outcome values and the inputs in order to predict the values of the results.
• The relationship can be represented by the straight-line function:
y = xw + b
• y: the dependent variable, also called the response or outcome variable.
• x: the independent variable, also called the explanatory variable.
• b: the intercept, i.e. the intersection point with the y-axis (y-intercept).
• w: the slope of the straight line.

[Figure: a straight line y = xw + b, labeling the slope w and the y-intercept b.]
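The straight-line function above can be sketched in a couple of lines of Python. This is a minimal illustration; the function name `predict` is my own, and the slope/intercept values are taken from the deterministic example y = 0.5x + 1 that appears later in these slides:

```python
# Minimal sketch of the straight-line model y = x*w + b.
def predict(x, w, b):
    """Predicted outcome for input x, given slope w and intercept b."""
    return x * w + b

# Using the deterministic example y = 0.5x + 1 from a later slide:
print(predict(4, w=0.5, b=1))  # -> 3.0
```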
Linear Regression and Straight-Line Function

• The slope w is equal to the change in y divided by the change in x between any two points on the line: w = (y2 − y1) / (x2 − x1).
• The intercept b is equal to the value of y at which the line crosses the y-axis (x = 0); equivalently, for any point (x, y) on the line, b = y − wx.
Linear Regression

• A linear equation can be represented graphically by using a scatter plot.
• Characteristics of the line of best fit:
1. It is the equation that minimizes the difference between the expected/predicted results and the actual results.
2. It is the line that passes through most of the data elements in the training set.

[Figure: scatter plot with a fitted line, marking the expected (predicted) result value on the line and the actual result value at a data point.]
Linear Regression and Deterministic Relationship

• Linear regression is considered fast and effective in many problems.
• The example in the graph represents a deterministic relationship that can be represented by y = 0.5x + 1.
• A deterministic relationship is rare, but it is more common in fields such as business administration and sociology.
Example: Assets = Liabilities + Equity

[Figure: the line y = 0.5x + 1, labeling the slope w and the y-intercept b.]
Linear Regression

• In most problems, the relationship between the inputs (explanatory variables) and the results is non-deterministic.
• Therefore, we must estimate the values of the coefficients of the linear function best suited to the data (the line of best fit).
• For example, assume that we have data on the number of daily study hours for an exam and the corresponding exam score.
• We note here that there is a positive correlation between study hours and exam scores: the more hours of daily study, the higher the exam score.

Study Hours | Exam Score
1 | 0.76
1 | 0.78
3 | 0.82
3 | 0.80
4 | 0.86
5 | 0.88
5 | 0.90
6 | 0.94
7 | 0.96
Linear Regression

• If we assume that there is a linear relationship between the result (exam score) and the explanatory variable (study hours), this can be represented using the straight-line function, which takes the following form:
ŷ(x) = xw + b
• ŷ is the estimated value of the result, which depends on the explanatory variable x.
Simple Linear Regression

In simple linear regression there is:
• Only one independent variable x.
• The relationship between x and y is described by a linear function.
• Changes in y are assumed to result from changes in x.

• The linear regression function can be used to predict or estimate the value of the result given the explanatory variable x.

[Figure: the regression line ŷ(x) = xw + b, labeling the actual result, the estimated result value, the slope, and the y-intercept.]
Calculation of Least Squares Line Coefficients
Simple Linear Regression

• Inputs: X = {x1, x2, …, xm}; outputs: y = {y1, y2, …, ym}.
• The goal: find the values of the linear function that fit the given data.
• The values of b and w are obtained by finding the b and w that minimize the sum of the squared errors between y and ŷ.
• To do this, we calculate the partial derivatives of the function (SSE) and then solve the equations to get the least squares estimates for b and w.
Simple Linear Regression and Coefficient Functions

• The least squares estimates of the slope coefficient w and the intercept b are:

w = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² = (nΣxy − Σx·Σy) / (nΣx² − (Σx)²)

b = ȳ − w·x̄ = Σy/n − w·Σx/n

where ȳ and x̄ are the average values of y and x.
Exercise: Simple Linear Regression

• Returning to the previous example, we estimate the least squares line values for the study-hours data:

w = (nΣxy − Σx·Σy) / (nΣx² − (Σx)²)

w = (9×31.1 − 35×7.7) / (9×171 − 35²) = 10.4 / 314 ≈ 0.033

b = 7.7/9 − 0.033 × 35/9 ≈ 0.727
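The closed-form formulas above are easy to check in plain Python. A minimal sketch, assuming the study-hours data from the earlier table; the function name `least_squares` is my own, not from the slides:

```python
def least_squares(xs, ys):
    """Closed-form least squares estimates of slope w and intercept b for
    y = xw + b, using w = (n*Sxy - Sx*Sy) / (n*Sxx - Sx**2) and
    b = Sy/n - w*Sx/n."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    w = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = sy / n - w * sx / n
    return w, b

# Study-hours data from the earlier table:
hours = [1, 1, 3, 3, 4, 5, 5, 6, 7]
scores = [0.76, 0.78, 0.82, 0.80, 0.86, 0.88, 0.90, 0.94, 0.96]
w, b = least_squares(hours, scores)
print(round(w, 3), round(b, 3))  # -> 0.033 0.727
```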
Measuring Error

• The squared difference between the expected (predicted) result and the actual result is called the loss function, cost function, or residual:
Loss = (actual result − expected result)²
• Returning to the previous example, suppose we want to calculate the error in the estimated exam score ŷ when the number of study hours is 6. Using the fitted coefficients w ≈ 0.033 and b ≈ 0.727:
ŷ(x) = 0.033 × 6 + 0.727 ≈ 0.925
Actual result: y = 0.94
(ŷ(x) − y)² = (0.94 − 0.925)² ≈ 0.0002
Measuring Error

• The two most used functions for measuring the error rate are:
Sum of Squared Errors:
SSE = Σ (yᵢ − ŷᵢ)², i = 1…m = (y₁ − ŷ₁)² + … + (yₘ − ŷₘ)²
Mean Squared Error, the SSE averaged over the m examples:
MSE = SSE / m
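The SSE can be computed directly over any set of actual and predicted values. A minimal sketch; the helper name `sse` and the sample numbers are mine, chosen only for illustration:

```python
def sse(actual, predicted):
    """Sum of squared errors: SSE = sum over i of (y_i - yhat_i)**2."""
    return sum((y - yh) ** 2 for y, yh in zip(actual, predicted))

# Illustrative values (not from the slides' data):
actual = [0.94, 0.96]
predicted = [0.90, 0.95]
print(round(sse(actual, predicted), 4))  # -> 0.0017
```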
Multivariate Linear Regression

• Multivariate regression is an extension of simple linear regression that is used when we want to predict the value of a variable based on the values of two or more other variables.
• The example here shows sales numbers y and the durations of television ads x1 and radio ads x2. The data represent the effect of the duration (in minutes) of television and radio ads on the sales value of a product.
• We notice here that the longer the advertisement runs on both television and radio stations, the greater the number of sales.
Linear Regression Equation

• A linear equation can be multivariable:
ŷ(x) = w1x1 + w2x2 + … + wnxn + b
• x are the explanatory variables.
• ŷ is the response/outcome variable.
• w are the coefficients, or weights, of the explanatory variables.
Multivariate Linear Regression

• As shown in the three-dimensional graph, the relationship can be represented linearly using a plane surface that describes the data with the highest possible accuracy, which facilitates predicting the number of future sales based on the duration of advertisements.
• The surface equation takes the following form:
ŷ(x) = w1x1 + w2x2 + b
• x1 is the duration of television ads.
• x2 is the duration of radio ads.
• ŷ is the estimated value of the sales numbers.
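Evaluating the plane equation works the same way as evaluating the straight line, just with two inputs. A minimal sketch; the function name and the coefficient values here are illustrative assumptions, not fitted from the slides' data:

```python
def predict_sales(x1, x2, w1, w2, b):
    """Estimated sales for TV-ad duration x1 and radio-ad duration x2 (minutes),
    using the plane model yhat = w1*x1 + w2*x2 + b."""
    return w1 * x1 + w2 * x2 + b

# Illustrative (assumed) coefficients:
print(predict_sales(10, 5, w1=2.0, w2=3.0, b=1.0))  # -> 36.0
```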
Gradient Descent

• To reach the optimal plane surface that gives the best predicted values of sales, iterative techniques are used to search for the coefficient values/weights (w1, w2, b) that minimize the cost function (the sum or mean of the squared errors).
Gradient Descent

• The gradient descent algorithm is one of the optimization algorithms used in many machine learning models to reduce the value of the cost function and improve prediction accuracy.
• In the graph, each point (w1, w2) has a specific cost value. The lowest cost value is located in the middle of the figure, indicated by the red circles (contours).
• The algorithm begins by assigning random values to all coefficients and then moves gradually toward the coefficient values with the lowest cost.

[Figure: contour plot of the cost as a function of the coefficients (w1, w2).]
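The procedure described above can be sketched in plain Python. This is a minimal sketch under stated simplifications: mean squared error as the cost, a fixed learning rate, zero initial coefficients instead of random ones (to keep the run deterministic), and synthetic data in place of the slides' ad-duration table:

```python
def gradient_descent(points, ys, lr=0.01, steps=5000):
    """Fit yhat = w1*x1 + w2*x2 + b by gradient descent on the MSE cost.
    points: list of (x1, x2) pairs; ys: the observed outcomes."""
    w1 = w2 = b = 0.0
    n = len(points)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        # Accumulate the partial derivatives of the MSE w.r.t. w1, w2, b.
        for (x1, x2), y in zip(points, ys):
            err = (w1 * x1 + w2 * x2 + b) - y
            g1 += 2 * err * x1 / n
            g2 += 2 * err * x2 / n
            gb += 2 * err / n
        # Move each coefficient a small step against its gradient.
        w1 -= lr * g1
        w2 -= lr * g2
        b -= lr * gb
    return w1, w2, b

# Synthetic data generated from y = 2*x1 + 3*x2 + 1 (illustrative, not the
# slides' data), so the fitted coefficients should come back near (2, 3, 1):
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [2 * x1 + 3 * x2 + 1 for x1, x2 in points]
w1, w2, b = gradient_descent(points, ys)
print(round(w1, 2), round(w2, 2), round(b, 2))  # -> 2.0 3.0 1.0
```

Because the synthetic data are exactly linear, the minimum of the cost is zero and the iteration recovers the generating coefficients; on noisy real data it would instead converge to the least squares plane.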
Thank You
