Classification Models
Logistic Regression:
Explanation:
Logistic regression is used for binary classification problems, i.e., when the response variable is binary (0 or 1).
In this case, we are predicting whether a car's miles per gallon (mpg) is above or below the mean value.
When to Use:
When the relationship between the predictor variables and the log-odds of the response is approximately linear.
Logistic Regression is chosen when the response variable is categorical, and in this example, it's whether
the mpg is above or below the mean.
Suitable for problems where the outcome is binary, like whether an email is spam or not.
Predictors:
Numeric predictors from the mtcars data; the example below uses weight (wt) and horsepower (hp), chosen purely for illustration.
# =======================================
# Logistic Regression example: is mpg above its mean? (mtcars)
# =======================================
library(caret)
library(dplyr)
# Binary response: 1 if mpg is above its mean, 0 otherwise
data <- mtcars %>% mutate(mpg_high = factor(ifelse(mpg > mean(mpg), 1, 0)))
# Summary statistics
summary(data)
set.seed(123)
# Split the data into training (80%) and testing (20%) sets
train_idx  <- createDataPartition(data$mpg_high, p = 0.8, list = FALSE)
train_data <- data[train_idx, ]
test_data  <- data[-train_idx, ]
# Fit the logistic regression model (wt and hp are illustrative predictors)
log_model <- glm(mpg_high ~ wt + hp, data = train_data, family = binomial)
summary(log_model)
# Confusion matrix on the held-out test set
pred <- factor(as.integer(predict(log_model, test_data, type = "response") > 0.5), levels = c(0, 1))
conf_matrix <- confusionMatrix(pred, test_data$mpg_high)
conf_matrix
=======================
Discriminant Analysis:
Explanation:
Discriminant Analysis is used when there are two or more classes and the goal is to find the linear
combination of features that best separates them.
When to Use:
When you have two or more classes and want to classify new observations into one of them.
Predictors:
Works best with continuous predictors that are roughly normally distributed within each class; a minimal sketch follows.
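A minimal sketch of linear discriminant analysis, assuming the MASS and caret packages and R's built-in iris data (three classes); the data set and variable names are illustrative choices, not from the original notes.
# =======================================
# Linear Discriminant Analysis example (iris, three classes)
# =======================================
library(MASS)
library(caret)
set.seed(123)
# 80/20 split of the iris data
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
iris_train <- iris[idx, ]
iris_test  <- iris[-idx, ]
# Fit LDA: linear combinations of the four measurements that best separate the species
lda_model <- lda(Species ~ ., data = iris_train)
lda_model
# Classify held-out observations and compare to the true labels
lda_pred <- predict(lda_model, newdata = iris_test)$class
confusionMatrix(lda_pred, iris_test$Species)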
Naive Bayes:
Explanation:
Naive Bayes is a probabilistic algorithm based on Bayes' theorem, assuming independence between
predictors.
Despite its "naive" assumption, it performs surprisingly well in many real-world situations.
When to Use:
When the predictors are roughly conditionally independent given the class, and especially for high-dimensional problems such as text classification (e.g., spam filtering).
Predictors:
Handles both categorical and numeric predictors; numeric predictors are typically modeled with a per-class Gaussian distribution. A short sketch follows.
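A minimal Naive Bayes sketch, assuming the e1071 package and the same illustrative iris split as in the discriminant analysis example.
# =======================================
# Naive Bayes example (iris)
# =======================================
library(e1071)
library(caret)
set.seed(123)
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
iris_train <- iris[idx, ]
iris_test  <- iris[-idx, ]
# Fit Naive Bayes (numeric predictors modeled as Gaussian within each class)
nb_model <- naiveBayes(Species ~ ., data = iris_train)
# Predict held-out classes and summarize performance
nb_pred <- predict(nb_model, newdata = iris_test)
confusionMatrix(nb_pred, iris_test$Species)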
Support Vector Machine (SVM):
Explanation:
SVM is a powerful classification algorithm that finds the hyperplane separating the classes with the largest possible margin.
It works well in high-dimensional spaces and is effective in cases where the number of dimensions is
greater than the number of samples.
When to Use:
When a clear margin of separation between classes is expected, for high-dimensional data, or when the number of features exceeds the number of samples.
Predictors:
Works with numeric predictors; it's essential to scale the data for SVM.
Plots:
Commonly used plots include ROC curves, confusion matrices, and decision boundaries.
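As one illustration, a minimal ROC-curve sketch, assuming the pROC package (an added dependency, not mentioned in the original notes) and the log_model / test_data objects from the logistic regression example above.
# ROC curve for the logistic regression model fitted above
library(pROC)
test_prob <- predict(log_model, newdata = test_data, type = "response")
roc_obj   <- roc(response = test_data$mpg_high, predictor = test_prob)
plot(roc_obj, main = "ROC: logistic regression (mpg above mean)")
auc(roc_obj)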
SVM:
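A minimal SVM sketch, again assuming the e1071 package and the illustrative iris split; the radial kernel and cost value are illustrative defaults, not from the original notes. Note that e1071::svm() scales numeric predictors by default (scale = TRUE), which matches the scaling advice above.
# =======================================
# Support Vector Machine example (iris)
# =======================================
library(e1071)
library(caret)
set.seed(123)
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
iris_train <- iris[idx, ]
iris_test  <- iris[-idx, ]
# Radial (RBF) kernel SVM; numeric predictors are scaled by default (scale = TRUE)
svm_model <- svm(Species ~ ., data = iris_train, kernel = "radial", cost = 1)
# Evaluate on the held-out test set
svm_pred <- predict(svm_model, newdata = iris_test)
confusionMatrix(svm_pred, iris_test$Species)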