12 Bias-Variance - Underfit - Overfit
What Is Bias?
Bias is the set of simplifying assumptions a model makes about the target function to make it easier to learn.
- Low Bias: The model makes fewer assumptions about the form of the target function, so its predictions stay close to the actual values.
- High Bias: The model makes more (stronger) assumptions about the form of the target function, so its predictions can be far from the actual values.
Examples of low-bias machine learning algorithms include Decision Trees, k-Nearest Neighbors and Support
Vector Machines.
Examples of high-bias machine learning algorithms include Linear Regression, Linear Discriminant Analysis,
and Logistic Regression.
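To make the contrast concrete, here is a minimal sketch, assuming scikit-learn and a small synthetic non-linear dataset (the sample size and tree depth are illustrative), comparing a high-bias linear regression with a low-bias decision tree:

```python
# Sketch: a high-bias model (LinearRegression) vs. a low-bias model
# (DecisionTreeRegressor) on a non-linear target. The linear model's strong
# straight-line assumption keeps its predictions far from the actual values,
# while the tree makes few assumptions and can follow the curve.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = np.sort(rng.uniform(-3, 3, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)  # non-linear target

lin = LinearRegression().fit(X, y)                     # high bias
tree = DecisionTreeRegressor(max_depth=6).fit(X, y)    # low bias

print("Linear regression train MSE:", mean_squared_error(y, lin.predict(X)))
print("Decision tree     train MSE:", mean_squared_error(y, tree.predict(X)))
```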
What Is Variance?
Variance is the amount by which the estimate of the target function will change if different training data is used.
It determines how spread out the predicted values are from each other.
- Low Variance: Small changes to the estimate of the target function
with changes to the training dataset.
- High Variance: Large changes to the estimate of the target function
with changes to the training dataset.
Examples of low-variance machine learning algorithms include Linear Regression, Linear Discriminant
Analysis, and Logistic Regression.
Examples of high-variance machine learning algorithms include Decision Trees, k-Nearest Neighbors and
Support Vector Machines.
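As a minimal sketch of the same idea, assuming scikit-learn and a synthetic dataset (the sample sizes and query points are illustrative), we can refit each model on two different random halves of the data and measure how much its predictions move:

```python
# Sketch of variance: refit the same model on two different random halves of
# the training data and compare its predictions at fixed query points. The
# decision tree (high variance) typically changes a lot between fits;
# linear regression (low variance) barely changes.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(1)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=400)
X_query = np.linspace(-3, 3, 5).reshape(-1, 1)  # fixed points to predict at

for name, model in [("tree", DecisionTreeRegressor()), ("linear", LinearRegression())]:
    preds = []
    for seed in (0, 1):  # two different random halves of the data
        idx = np.random.RandomState(seed).choice(len(X), size=200, replace=False)
        preds.append(model.fit(X[idx], y[idx]).predict(X_query))
    spread = np.abs(preds[0] - preds[1]).mean()
    print(f"{name}: mean change in predictions across datasets = {spread:.3f}")
```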
So far we have covered the basics of bias and variance; now let's
look at two other important terms related to them:
underfitting and overfitting.
Underfit :
These models have high bias and low variance.
Underfitting happens when the model is too simple to capture the underlying pattern in the data, so it performs poorly on both the training and the testing data.
One remedy is to increase the complexity of the model: for example, if someone is fitting a linear regression to some data, then increasing the complexity would mean fitting a polynomial model instead, as in the sketch below.
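A minimal sketch of this remedy, assuming scikit-learn and a synthetic quadratic dataset (the degree and noise level are illustrative):

```python
# Sketch: fixing underfitting by increasing model complexity. A plain linear
# regression underfits a quadratic target, while adding polynomial features
# lets the same estimator capture the pattern.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.RandomState(2)
X = rng.uniform(-2, 2, size=(150, 1))
y = X.ravel() ** 2 + rng.normal(scale=0.1, size=150)  # quadratic target

linear = LinearRegression().fit(X, y)                                   # underfits
poly = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X, y)

print("Linear model R^2:    ", linear.score(X, y))   # low: model too simple
print("Polynomial model R^2:", poly.score(X, y))     # high: complexity increased
```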
Overfit :
These models have low bias and high variance.
Overfitting happens when our model captures the noise along with the underlying pattern in the data.
It happens when we train the model too much on a noisy dataset.
Very complex models, like deep decision trees, are prone to overfitting.
For example, if the training accuracy is 99% but the testing accuracy is only around 50%, the huge gap between the
training and testing scores indicates overfitting.
Overfitting is when the model's error on the training set (i.e. during training) is very low, but the
model's error on the test set (i.e. on unseen samples) is large.
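A minimal sketch of this symptom, assuming scikit-learn and a synthetic noisy classification dataset (the label-noise setting flip_y=0.2 is illustrative):

```python
# Sketch of the overfitting symptom: an unconstrained decision tree scores
# near-perfectly on the training set but noticeably worse on held-out test
# data; the gap between the two scores is the warning sign.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # grows until leaves are pure
print("Training accuracy:", tree.score(X_train, y_train))  # close to 1.0
print("Testing accuracy: ", tree.score(X_test, y_test))    # much lower -> overfit
```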
There are two common ways to combat overfitting:
1. Use cross-validation.
2. Reduce the complexity of the model (make the model less complex).
When it comes to solution 1, i.e. the use of cross-validation, the most famous CV scheme is k-fold
cross-validation. Using a k-fold scheme, we train and test the model k times on different subsets of the
training data and estimate a performance metric using the held-out (unseen) data.
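A minimal sketch of a 5-fold scheme, assuming scikit-learn and a synthetic dataset (the estimator and fold count are illustrative):

```python
# Sketch of k-fold cross-validation: cross_val_score trains and tests the
# model k times on different splits of the data and returns one score per
# held-out fold.
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)

print("Per-fold accuracy:", scores)
print("Mean CV accuracy: ", scores.mean())
```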
When it comes to solution 2, reducing the complexity of the model can help reduce overfitting. For
example, if someone is using an SVM model with an RBF kernel, then reducing the complexity would mean
using a linear kernel instead. In another case, if someone is fitting a polynomial to some data, then reducing the
complexity would mean fitting a linear model instead (linear regression).
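A minimal sketch of the SVM example, assuming scikit-learn and a synthetic noisy dataset (the values of C, gamma, and the noise level are illustrative); the RBF kernel typically shows a larger train/test gap here than the linear one:

```python
# Sketch of reducing model complexity: the flexible RBF kernel tends to
# memorise label noise, while swapping in a linear kernel typically narrows
# the gap between training and testing scores on this noisy task.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, flip_y=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for kernel in ("rbf", "linear"):
    model = SVC(kernel=kernel, gamma="scale", C=10).fit(X_train, y_train)
    print(f"{kernel:6s} kernel  train={model.score(X_train, y_train):.2f}  "
          f"test={model.score(X_test, y_test):.2f}")
```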
There is no escaping the relationship between bias and variance in machine learning.
If the algorithm is too simple, it will be in a high-bias, low-variance condition and thus error-prone.
If the algorithm is too complex, it will be in a high-variance, low-bias condition, and in that case it will not
perform well on new entries. The sweet spot between these two conditions is known as the
Bias-Variance Trade-off.
(Figure: total error versus model complexity, illustrating the bias-variance trade-off.)
The best point on this curve is the model complexity chosen for training the algorithm, where the error is low on both the
training and the testing data.
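A minimal sketch of finding that point empirically, assuming scikit-learn and a synthetic dataset, by sweeping a single complexity knob (tree depth; the listed depths are illustrative):

```python
# Sketch of the trade-off: as tree depth grows, training error keeps falling,
# while test error first falls and then rises again; the depth with the lowest
# test error is the "best point" on the error-versus-complexity curve.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for depth in (1, 2, 4, 8, 16, None):  # None lets the tree grow fully
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    print(f"max_depth={str(depth):4s}  train error={1 - tree.score(X_train, y_train):.2f}  "
          f"test error={1 - tree.score(X_test, y_test):.2f}")
```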