Logistic Regression

Logistic regression is a machine learning classification algorithm that predicts a categorical dependent variable by estimating class probabilities. It fits an S-shaped logistic function whose output lies between 0 and 1 and uses that output to classify observations. Unlike linear regression, which is used for regression problems, logistic regression is used for classification problems where the dependent variable is categorical. It has applications in banking, weather prediction, ecommerce, and other areas involving binary or multiclass classification.

Uploaded by

Rahul sharma

Logistic Regression

o Logistic regression is one of the most popular machine learning algorithms and belongs
to the supervised learning family. It is used to predict a categorical dependent variable
from a given set of independent variables.
o Because the dependent variable is categorical, the outcome must be a categorical or
discrete value such as Yes or No, 0 or 1, True or False. Instead of returning exactly 0 or 1,
however, the model returns a probability that lies between 0 and 1.
o Logistic regression is very similar to linear regression; the difference lies in how they
are used. Linear regression is used for solving regression problems, whereas logistic
regression is used for solving classification problems.
o In logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic
function whose output is bounded by two limiting values, 0 and 1.
o Logistic regression is a significant machine learning algorithm because it can provide
probabilities and classify new data using both continuous and discrete features.
o Logistic regression can classify observations using different types of data and can help
identify which variables are most effective for the classification.
[Figure: the logistic function]
➢ Logistic Function (Sigmoid Function):

o The sigmoid function is a mathematical function used to map predicted values to
probabilities.
o It maps any real value to a value between 0 and 1.
o The output of logistic regression must stay between 0 and 1 and cannot go beyond these
limits, so it forms an "S"-shaped curve. This S-shaped curve is called the sigmoid
function or the logistic function.
o In logistic regression, we use a threshold value to turn the probability into a class of
either 0 or 1: values above the threshold are assigned to class 1, and values below the
threshold are assigned to class 0.
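
The mapping and the threshold rule described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation; the function names and the default threshold of 0.5 are assumptions made for the example.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number z to a value between 0 and 1."""
    # Two branches keep exp() from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify(z: float, threshold: float = 0.5) -> int:
    """Return class 1 if the probability reaches the threshold, else class 0."""
    return 1 if sigmoid(z) >= threshold else 0

for z in (-4.0, 0.0, 4.0):
    print(z, round(sigmoid(z), 3), classify(z))
```

Raising or lowering the threshold argument changes how readily an observation is assigned to class 1, without changing the probabilities themselves.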

➢ Logistic Regression Equation:

The logistic regression equation can be obtained from the linear regression equation. The
mathematical steps are given below:

o We know the equation of a straight line can be written as:

y = 𝛽0 + 𝛽1 * x

o In logistic regression, y can only be between 0 and 1, so we replace y with the ratio
y / (1 - y), which is 0 for y = 0 and tends to +∞ as y approaches 1:

y / (1 - y)

o But we need a range from -∞ to +∞, so we take the logarithm of this ratio, and the
equation becomes:

log(y / (1 - y)) = 𝛽0 + 𝛽1 * x

The above equation is the final equation for logistic regression.
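
The effect of the logarithm step can be checked numerically. The sketch below assumes the names `logit` and `sigmoid` (neither appears in the text) and shows that the log of y / (1 - y) stretches values between 0 and 1 over the whole real line, and that the sigmoid undoes it.

```python
import math

def logit(y: float) -> float:
    """Log of y / (1 - y); defined only for 0 < y < 1."""
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie strictly between 0 and 1")
    return math.log(y / (1.0 - y))

def sigmoid(z: float) -> float:
    """Inverse of logit: maps a real number back into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Values near 0 map to large negative numbers, values near 1 to large positive ones.
for y in (0.001, 0.5, 0.999):
    z = logit(y)
    print(f"y={y}  logit={z:.3f}  back={sigmoid(z):.3f}")
```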


➢ How Does the Logistic Regression Algorithm Work?

Consider the following example: an organization wants to determine an employee’s salary
increase based on their performance.

For this purpose, a linear regression algorithm will help them decide. Plotting a regression
line with the employee’s performance as the independent variable and the salary increase
as the dependent variable will make their task easier.

Now, what if the organization wants to know whether an employee would get a promotion
or not based on their performance? The linear graph described above won’t be suitable in
this case, because its output is not limited to the range between zero and one. Instead, we
bound the output at zero and one by converting the line into a sigmoid curve (S curve).

Based on a threshold value, the organization can decide whether an employee will get a
promotion or not.
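
The promotion example can be made concrete with a small sketch. The text does not say how the curve is fitted; the version below assumes plain batch gradient descent on the log loss, and the scores, labels, learning rate, and function names are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical data: performance score and whether the employee was promoted (1) or not (0).
scores   = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
promoted = [0,   0,   0,   0,   1,   0,   1,   1,   1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, lr=0.05, epochs=20000):
    """Estimate intercept b0 and slope b1 by batch gradient descent on the log loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # For the log loss, the gradient is the mean of (prediction - label),
        # multiplied by x for the slope term.
        errors = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        b0 -= lr * sum(errors) / n
        b1 -= lr * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

b0, b1 = fit(scores, promoted)

def will_be_promoted(score, threshold=0.5):
    """Apply the threshold to the fitted probability."""
    return sigmoid(b0 + b1 * score) >= threshold

print(will_be_promoted(9.0), will_be_promoted(1.0))
```

In practice a library routine would normally replace the hand-written loop; the loop is shown only to make the fitting step visible.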
To understand logistic regression, let’s go over the odds of success.

Odds (𝜃) = Probability of an event happening / Probability of an event not happening

𝜃 = p / (1 - p)

The values of odds range from zero to ∞, and the values of probability lie between zero and
one.
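
A few values make the relationship between probability and odds easier to see. This is a minimal sketch; the function name `odds` and the sample probabilities are chosen for illustration.

```python
def odds(p: float) -> float:
    """Odds of an event with probability p, for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)

# Odds stay below 1 when p < 0.5, equal 1 at p = 0.5, and grow without bound as p nears 1.
for p in (0.1, 0.5, 0.8, 0.99):
    print(f"p={p}  odds={odds(p):.2f}")
```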

Consider the equation of a straight line:

𝑦 = 𝛽0 + 𝛽1 * 𝑥

Here, 𝛽0 is the y-intercept

𝛽1 is the slope of the line

x is the value of the x coordinate

y is the value of the prediction

Now, to predict the odds of success, we use the following formula:

log(p(x) / (1 - p(x))) = 𝛽0 + 𝛽1 * 𝑥

Exponentiating both sides, we have:

p(x) / (1 - p(x)) = e^(𝛽0 + 𝛽1 * 𝑥)

Let Y = e^(𝛽0 + 𝛽1 * 𝑥)

Then p(x) / (1 - p(x)) = Y

p(x) = Y(1 - p(x))

p(x) = Y - Y(p(x))

p(x) + Y(p(x)) = Y

p(x)(1+Y) = Y

p(x) = Y / (1 + Y)

The equation of the sigmoid function is:

p(x) = e^(𝛽0 + 𝛽1 * 𝑥) / (1 + e^(𝛽0 + 𝛽1 * 𝑥)) = 1 / (1 + e^-(𝛽0 + 𝛽1 * 𝑥))

[Figure: the sigmoid curve obtained from the above equation]
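
The shape of this curve can also be seen numerically. The sketch below assumes hypothetical coefficients (an intercept of -5 and a slope of 1, chosen only for illustration), evaluates p(x) = Y / (1 + Y) alongside the equivalent form 1 / (1 + e^-(𝛽0 + 𝛽1 * 𝑥)), and prints a rough text plot.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
BETA0 = -5.0   # intercept
BETA1 = 1.0    # slope

def p_from_odds(x: float) -> float:
    """p(x) = Y / (1 + Y), with Y = e^(beta0 + beta1 * x)."""
    y = math.exp(BETA0 + BETA1 * x)
    return y / (1.0 + y)

def p_sigmoid(x: float) -> float:
    """Equivalent form: p(x) = 1 / (1 + e^-(beta0 + beta1 * x))."""
    return 1.0 / (1.0 + math.exp(-(BETA0 + BETA1 * x)))

# Each row prints x, p(x), and a bar whose length is proportional to p(x).
for x in range(0, 11, 2):
    bar = "#" * round(p_sigmoid(x) * 40)
    print(f"x={x:2d}  p={p_sigmoid(x):.3f}  {bar}")
```

The bars grow slowly at first, fastest around the point where p(x) crosses 0.5, and then level off, which is the S shape the text describes.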


➢ Linear Regression vs. Logistic Regression

Linear Regression | Logistic Regression
------------------|--------------------
Used to solve regression problems | Used to solve classification problems
The response variables are continuous in nature | The response variable is categorical in nature
It helps estimate the dependent variable when there is a change in the independent variable | It helps to calculate the possibility of a particular event taking place
It is a straight line | It is an S-curve (S = Sigmoid)

➢ Applications of Logistic Regression

1. Using the logistic regression algorithm, banks can predict whether a customer will
default on a loan.
2. It can be used to predict the weather conditions of a certain place (sunny, windy,
rainy, humid, etc.).
3. Ecommerce companies can identify buyers who are likely to purchase a certain
product.
4. Companies can predict whether they will gain or lose money in the next month,
quarter, or year based on their current performance.
5. It can be used to classify objects based on their features and attributes.

➢ Types of Logistic Regression:

On the basis of the categories of the dependent variable, logistic regression can be
classified into three types:

o Binomial: In binomial logistic regression, the dependent variable can have only two
possible values, such as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial logistic regression, the dependent variable can have 3 or
more possible unordered values, such as "cat", "dog", or "sheep".
o Ordinal: In ordinal logistic regression, the dependent variable can have 3 or more
possible ordered values, such as "Low", "Medium", or "High".
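
The text does not say how the multinomial case turns scores into probabilities. One common formulation, assumed here rather than taken from the source, computes one linear score per class and normalizes the scores with the softmax function. The scores below are hypothetical.

```python
import math

def softmax(scores):
    """Turn one score per class into probabilities that sum to 1."""
    m = max(scores)                           # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["cat", "dog", "sheep"]
scores = [2.0, 1.0, 0.1]                      # hypothetical linear scores, one per class
probs = softmax(scores)
predicted = classes[probs.index(max(probs))]  # pick the most probable class
print(predicted, [round(p, 3) for p in probs])
```

With only two classes and the second score fixed at 0, this reduces to the sigmoid used in the binomial case.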

➢ Advantages of Logistic Regression

1. Logistic regression is easy to implement and interpret, and very efficient to train.
2. It is very fast at classifying unknown records.
3. It gives good accuracy on many simple data sets and performs well when the dataset is
linearly separable.
4. Logistic regression is less prone to overfitting than more complex models, but it can
still overfit on high-dimensional datasets. Regularization techniques (L1 and L2) can be
considered to avoid overfitting in these scenarios.
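
To illustrate the regularization mentioned in the last point, the sketch below adds an L2 penalty to the average log loss. The penalty strength `lam`, the function name, and the sample rows are assumptions made for the example; leaving the bias unpenalized is a common convention, not something the text specifies.

```python
import math

def log_loss_l2(weights, bias, xs, ys, lam=0.1):
    """Average log loss plus an L2 penalty lam * sum(w^2) on the weights."""
    total = 0.0
    for features, y in zip(xs, ys):
        z = bias + sum(w * f for w, f in zip(weights, features))
        p = 1.0 / (1.0 + math.exp(-z))
        p = min(max(p, 1e-12), 1.0 - 1e-12)        # keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    penalty = lam * sum(w * w for w in weights)    # the bias is left unpenalized
    return total / len(xs) + penalty

xs = [[0.5, 1.0], [2.0, 0.3], [1.5, 2.2]]          # hypothetical feature rows
ys = [0, 1, 1]
print(log_loss_l2([1.0, 1.0], 0.0, xs, ys, lam=0.0))
print(log_loss_l2([1.0, 1.0], 0.0, xs, ys, lam=1.0))
```

Because the penalty grows with the squared size of the weights, minimizing this quantity discourages the very large coefficients that tend to appear when a model overfits. An L1 penalty would use the sum of absolute values instead.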

➢ Disadvantages of Logistic Regression

1. If the number of observations is smaller than the number of features, logistic
regression should not be used; otherwise, it may lead to overfitting.
2. Non-linear problems can’t be solved with plain logistic regression because it has a
linear decision surface, and linearly separable data is rarely found in real-world scenarios.
3. Logistic regression requires little or no multicollinearity between the independent
variables.
4. It is difficult to capture complex relationships using logistic regression. More powerful
algorithms such as neural networks can easily outperform it on such problems.
5. In linear regression, the independent and dependent variables are related linearly.
Logistic regression instead requires that the independent variables be linearly related to
the log odds, log(p / (1 - p)).
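
The last point can be checked with a short sketch. Under hypothetical coefficients (the values below are assumptions for illustration), each one-unit step in x changes the probability by a different amount but changes the log odds by the same amount, the slope.

```python
import math

BETA0, BETA1 = -3.0, 0.8   # hypothetical intercept and slope

def prob(x):
    """Probability predicted by the logistic model at x."""
    return 1.0 / (1.0 + math.exp(-(BETA0 + BETA1 * x)))

def log_odds(p):
    return math.log(p / (1.0 - p))

for x in range(0, 6):
    dp = prob(x + 1) - prob(x)
    dlo = log_odds(prob(x + 1)) - log_odds(prob(x))
    print(f"x={x}->{x + 1}  change in p={dp:.3f}  change in log odds={dlo:.3f}")
```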
