
K-NN, Bias Variance Trade-off

and Classification Metrics


Nirav P Bhatt
Department of Data Science and AI
Robert Bosch Centre for Data Science and Artificial Intelligence
Indian Institute of Technology Madras, Chennai – 600036, India
K Nearest Neighbors Classifier
• Data: {(x1, y1), (x2, y2), ….., (xn, yn)}
• Features: xi = (xi1, xi2, …, xip), a p-dimensional feature vector for sample i
• Label: yi
• New test data xo
• What is the corresponding label?

• Instance-based classifier

• Uses the training data directly for classification (no explicit model is fit)
• Non-parametric method

K Nearest Neighbors Classifier
• How can we find the label of the new point?
• Old adage: something that walks and talks like a peacock may, statistically, still turn out to be a hen
• kNN idea: something that walks and talks like a peacock is highly likely to be a peacock, not a hen
K Nearest Neighbors Classifier
x: Class I and 0: Class II
[Figure: scatter plot of the training points from the two classes (x: Class I, 0: Class II) with a new point marked + to be classified]
• A kNN classifier needs:
  • Training data: {(x1, y1), (x2, y2), …, (xn, yn)}
  • A distance metric
  • The number of neighbors, K
K Nearest Neighbors Classifier
Algorithm
1. Data {(x1, y1), (x2, y2), ….., (xn, yn)}
2. For new data point, xo
3. Find the K nearest training point(s) to xo

4. Assign the label yo by majority vote among the K nearest neighbors (see the sketch below)

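A minimal sketch of the algorithm above in Python. The function name knn_predict and the use of Euclidean distance are illustrative choices, not something specified in the slides; any distance metric could be substituted.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x0, k=3):
    """Predict the label of a new point x0 by majority vote among its k nearest neighbors."""
    dists = np.linalg.norm(X_train - x0, axis=1)   # distance from x0 to every training point
    nearest = np.argsort(dists)[:k]                # indices of the k closest training points
    return Counter(y_train[nearest]).most_common(1)[0][0]  # majority vote over neighbor labels
```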
K Nearest Neighbors Classifier
Example:
x: Class I and 0: Class II
[Figure: the scatter plot of the two classes, with the new test point xo and its 3 nearest neighbors highlighted]
• K = 3
• Estimate the conditional class probabilities from the K neighbors (2 of the 3 nearest neighbors belong to Class I):
  • P(Y = Class I | X = xo) = 0.67
  • P(Y = Class II | X = xo) = 0.33
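Continuing the illustrative knn_predict sketch above, the conditional probabilities in this example can be estimated as the fraction of the K neighbors falling in each class. This is a minimal sketch, not code from the slides.

```python
import numpy as np

def knn_class_probs(X_train, y_train, x0, k=3):
    """Estimate P(Y = c | X = x0) as the fraction of the k nearest neighbors in class c."""
    dists = np.linalg.norm(X_train - x0, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return dict(zip(labels, counts / k))   # e.g. {"Class I": 0.67, "Class II": 0.33} for K = 3
```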
K Nearest Neighbors Classifier
2-class classification problem with 2 features
[Figures: kNN results on a two-feature, two-class data set; figures not reproduced]
K Nearest Neighbors Classifier
• Choice of K
• Large K value
  • Less flexible model (smoother decision boundary)
• Small K value
  • More flexible model
  • But sensitive to noisy data points
K Nearest Neighbors Classifier
How do we decide the “K”?

K Nearest Neighbors Classifier
How do we decide the “K”?

[Figure: training and test error rates as a function of K; the test error rate is roughly U-shaped and is minimized at some K*, while the training error rate keeps decreasing as the model becomes more flexible]
¹James, G., Witten, D., Hastie, T., and Tibshirani, R., An Introduction to Statistical Learning, 2021
K Nearest Neighbors Classifier
How do we decide the “K”?

¹James, G., Witten, D., Hastie, T., and Tibshirani, R., An Introduction to Statistical Learning, 2021
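One practical way to answer this question, sketched here under the assumption that a labelled hold-out set is available: evaluate the kNN error over a grid of K values and keep the best one. knn_predict is the illustrative helper defined earlier and choose_k is a hypothetical name.

```python
import numpy as np

def choose_k(X_train, y_train, X_val, y_val, k_values=(1, 3, 5, 7, 9, 15)):
    """Pick K by minimizing the misclassification rate on a held-out validation set."""
    errors = {}
    for k in k_values:
        preds = np.array([knn_predict(X_train, y_train, x, k=k) for x in X_val])
        errors[k] = np.mean(preds != y_val)   # validation error rate for this K
    return min(errors, key=errors.get), errors
```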
Flexible vs Inflexible Models

[Figure: three panels plotting y versus x, showing fits of increasing flexibility from an inflexible (e.g. linear) fit to a highly flexible fit]
Flexibility and Interpretability of Models
[Figure: interpretability (high to low) versus flexibility (low to high) of common methods. Subset selection, LASSO, and least squares sit at high interpretability and low flexibility; generalized additive models and decision trees are intermediate; bagging, boosting, support vector machines, and deep learning sit at low interpretability and high flexibility]
¹James, G., Witten, D., Hastie, T., and Tibshirani, R., An Introduction to Statistical Learning, 2021
Irreducible and Reducible Errors
Mean squared error between the actual and predicted y, using the fit f̂(x, p̂):

E[(y − f̂(x, p̂))²] = E[(f(x, p) − f̂(x, p̂))²] + Var(ε)

Reducible error: E[(f(x, p) − f̂(x, p̂))²], which can be lowered by choosing a better model
Irreducible error: Var(ε), the noise variance that no model can remove
Bias-Variance Trade-off

Bias-Variance Trade-off
kNN Classifier vs Linear Regression
Bias-Variance Trade-off and Prediction error
kNN MSE

Bias-Variance Trade-off
[Figure: error versus model complexity (# parameters): the bias² curve falls and the variance curve rises with complexity; their sum plus the irreducible error gives the total error, which is minimized at the optimal model between underfitting and overfitting]
Model Selection and Assessment
• Model selection is important for both linear and nonlinear models
• Data-rich situation: randomly divide the data into three parts

Ideal scenario (data-rich situation):

  Train (50%)       Validate (25%)      Test (25%)
  Fit the models    Model selection     Model assessment

Practice: Limited Amount of Data


Best Model in Practice? Need a Criterion
Resampling Methods
• Validate models by repeatedly drawing random samples from the training set
  • K-fold cross validation
  • Bootstrap
• Objective:
  • Predict the performance of the model(s) on test sets using only the training set
• Resampling methods are useful in data-scarce situations

Resampling Methods
• Consider the following data set
• Training set: {(x1, y1);(x2, y2);…; (xn, yn)}
• Test points: (x0, y0); suppose there are nt such observations
• Training error rate: not of interest for judging the predictive ability of the model
• Test error rate (computed on the nt test observations): this is what we care about

Data scarcity: separate test data are often not available
Resampling Methods
• Expected test error at a point x0:

  E[(y0 − f̂(x0))²] = Var(ε) + Var(f̂(x0)) + [Bias(f̂(x0))]²
                     (irreducible error + variance + squared bias)

• Interpretation of variance: the amount by which f̂ would change if we estimated it using a different training set
• Interpretation of bias: the amount of error introduced by approximating the problem with a simpler model
• Select the model that achieves low variance and low bias
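A hedged simulation sketch of these interpretations. The data-generating process (the true function, noise level) and the use of kNN regression here are assumptions made purely for illustration: refitting on many training sets lets us estimate the variance and squared bias of the fit at a fixed point.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x)             # assumed true regression function
x0, n, reps, k = 0.5, 100, 500, 5       # test point, sample size, repetitions, neighbors

preds = []
for _ in range(reps):
    X = rng.uniform(0, 1, n)                        # a fresh training set
    y = f(X) + rng.normal(0, 0.3, n)                # noisy responses
    nearest = np.argsort(np.abs(X - x0))[:k]        # k nearest neighbors of x0
    preds.append(y[nearest].mean())                 # kNN estimate of f(x0)

preds = np.array(preds)
variance = preds.var()                              # Var(f_hat(x0)) across training sets
bias_sq = (preds.mean() - f(x0)) ** 2               # [Bias(f_hat(x0))]^2
```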

Validation Set Approach
• Enough data: (1) Training set, (2) Validation set, and (3) Test
set
• Not enough data: Generate validation sets from a training set
• Validation set approach: divide (often randomly) the available observations into two parts
  [Schematic: the n observations 1, 2, …, n are split into a training part of nt observations and a validation (hold-out) part of nv observations]
  • A training set of nt observations
  • A validation set (or hold-out set) of nv observations
• Use the training set to fit the model
• Use the validation set to compute the validation-set error, which provides an estimate of the test error rate (see the sketch below)

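A minimal sketch of the validation-set approach, assuming caller-supplied fit and predict functions (both hypothetical names) for whatever model is being assessed:

```python
import numpy as np

def validation_set_mse(X, y, fit, predict, val_frac=0.5, seed=0):
    """Randomly split the data, fit on one part, and estimate the test MSE on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_val = int(val_frac * len(y))
    val, train = idx[:n_val], idx[n_val:]
    model = fit(X[train], y[train])                 # fit on the training part
    resid = y[val] - predict(model, X[val])         # predict the hold-out part
    return np.mean(resid ** 2)                      # validation MSE as a test-error estimate
```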
Validation Set Approach: Example
• Example: mileage ~ horsepower¹
• Nonlinear model: mileage ~ f(horsepower)
[Figure: validation-set MSE versus degree of the polynomial for repeated random splits; the resulting estimates of the test error show high variability]
¹Tibshirani et al. (2013)
Leave-one-out-cross-validation (LOOCV)
• Build model using (n-1) samples and predict
the response (yi) for the remaining sample

[Schematic: each of the n observations is left out in turn; the model is fit on the remaining n−1 observations and used to predict the one held out]

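A sketch of LOOCV under the same assumptions as above (fit and predict are hypothetical caller-supplied functions):

```python
import numpy as np

def loocv_mse(X, y, fit, predict):
    """Leave each observation out in turn, refit, and average the squared prediction errors."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i                    # leave observation i out
        model = fit(X[mask], y[mask])
        errors[i] = (y[i] - predict(model, X[i:i+1])[0]) ** 2
    return errors.mean()                            # CV(n): average of the n squared errors
```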
LOOCV: Example
• Example: mileage ~ horsepower¹
• Nonlinear model: mileage ~ f(horsepower)
[Figure: MSE versus degree of the polynomial for LOOCV and for the validation set approach, shown side by side]
¹Tibshirani et al. (2013)
Leave-one-out-cross-validation (LOOCV)
• Advantages
• Far less bias in comparison to the validation set approach
  (the training set contains n−1 observations in each iteration)
• Yields the same result every time it is run
  (no randomness in the training/validation splits)
• Does not overestimate the test error rate as much as the validation set approach
• Disadvantages
• Expensive to implement: the model has to be fit n times
• Asymptotically inconsistent: as n tends to infinity it need not choose the correct model
• It may select a model of larger size (more variables) than the optimal model

k-Fold Cross Validation
• Split the training data into k disjoint samples (folds) of roughly equal size: Z1, Z2, …, Zk
  [Schematic: the n observations are partitioned into folds 1, 2, …, k]
• For each validation sample Zi:
  • Use the remaining data to fit the model
  • Predict the responses for the validation sample Zi and compute its mean squared error (MSEi)
• Repeat for all k samples
• The k-fold CV estimate: CV(k) = (1/k) Σᵢ MSEi (a sketch follows below)

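A sketch of k-fold CV with the same hypothetical fit/predict conventions used earlier:

```python
import numpy as np

def kfold_cv_mse(X, y, fit, predict, k=5, seed=0):
    """Split into k folds, hold each out once, and average the fold MSEs."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)    # disjoint folds Z1, ..., Zk
    mses = []
    for val in folds:
        train = np.setdiff1d(np.arange(len(y)), val)
        model = fit(X[train], y[train])
        mses.append(np.mean((y[val] - predict(model, X[val])) ** 2))
    return float(np.mean(mses))                            # CV(k) = (1/k) * sum of MSE_i
```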
k-fold Validation
• For k = n, k-fold CV reduces to leave-one-out cross-validation (LOOCV)
• In practice, k = 5 or 10 is used
  • Lower computational cost
• For computationally intensive learning methods
• LOOCV fits the model n times
• k-fold CV fits the model k times

k-fold CV: Example
• Example: mileage ~ horsepower¹
• Nonlinear model: mileage ~ f(horsepower)
[Figure: MSE versus degree of the polynomial for LOOCV and for k-fold CV; the two curves are very similar]
¹Tibshirani et al. (2013)
k-fold CV: Example
• Example: mileage ~ horsepower¹
• Nonlinear model: mileage ~ f(horsepower)
[Figure: MSE versus degree of the polynomial for repeated 10-fold CV runs and for repeated validation-set splits]
k-fold CV has lower variability in comparison to the validation set approach
¹Tibshirani et al. (2013)
k-fold CV: Bias-Variance Trade-off
• Bias reduction in the test-error estimate: LOOCV is preferred
  • LOOCV provides nearly unbiased estimates: each training set contains n−1 observations
  • k-fold CV gives an intermediate level of bias: each training set contains (k−1)n/k observations
• Variance reduction in the test-error estimate: k-fold CV is preferred
  • LOOCV has higher variance: the n models are trained on almost identical sets of n−1 observations, so their outputs are highly correlated
  • k-fold CV (k < n) has lower variance: the overlap between the training sets used for the k models is smaller
5- or 10-fold CV yields test-error estimates with moderate bias and variance
Cross-validation: Classification Problems
• Regression problems: quantitative outcome yi
  • In CV, the MSE is used to quantify the test error
• Classification problems: yi is qualitative
  • How do we do CV?
  • Use the number of misclassified observations
• LOOCV error rate:

  CV(n) = (1/n) Σᵢ Erri, with Erri = I(yi ≠ ŷi), where I is the indicator function (a sketch follows below)

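For a classifier, the LOOCV loop simply replaces the squared error with the misclassification indicator. A sketch using the illustrative knn_predict helper from earlier:

```python
import numpy as np

def loocv_error_rate_knn(X, y, k=3):
    """LOOCV misclassification rate for a kNN classifier."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        y_hat = knn_predict(X[mask], y[mask], X[i], k=k)
        errs[i] = float(y_hat != y[i])              # Err_i = I(y_i != y_hat_i)
    return errs.mean()                              # CV(n) error rate
```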
Bootstrap
[Schematic: from the training sample Z = {z1, z2, …, zn}, draw m bootstrap samples Z*1, Z*2, …, Z*m; computing the statistic of interest S(·) on each gives the bootstrap replications S(Z*1), S(Z*2), …, S(Z*m)]
Bootstrap
• Normally used for quantifying the uncertainty associated with a
given estimator
• Training set: Z={z1,z2,…,zn} where zi=(xi,yi)
• Draw samples with replacement from the training set, each of the same size as the original training set
• Repeat the sampling m times to obtain m data sets Z*1, …, Z*m
• Compute the quantity of interest (e.g. regression parameters) from each data set
• Estimation of prediction errors

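A sketch of the basic bootstrap loop just described, used to quantify the uncertainty of a statistic S(·). The statistic is a caller-supplied function and bootstrap_se is a hypothetical name:

```python
import numpy as np

def bootstrap_se(Z, statistic, m=1000, seed=0):
    """Draw m bootstrap samples of Z (with replacement) and return the SE of the statistic."""
    rng = np.random.default_rng(seed)
    n = len(Z)
    reps = np.array([statistic(Z[rng.integers(0, n, size=n)])  # resample n rows with replacement
                     for _ in range(m)])                       # m replications S(Z*j)
    return reps.std(ddof=1), reps                              # bootstrap SE and the replications
```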
Bootstrap
• Estimation of prediction error

• MSEboot does not provide a good estimate. Why?
  • The original training set is effectively acting as the test set
  • The bootstrap sets are very close to the training set
• A better bootstrap estimate of the prediction error evaluates each observation i only with the bootstrap models fit on samples that do not contain it,
  where C⁻ⁱ denotes the set of indices, among the m bootstrap samples, of those not containing the ith observation
Bootstrap: Example
• Two instruments: A and B
• Property C= αA+(1- α)B, α is a parameter
• Variability associated with each instrument
• Objective : Choose α such that variance of C is minimized
• The value of α that minimizes Var(C) is given by

  α = (σB² − σAB) / (σA² + σB² − 2σAB)

• σA², σB², and σAB are unknown
• Estimate them using past data sets
Bootstrap: Example

[Figure: distribution of the estimates of α from repeated simulated data sets; the simulated (true) value is α = 0.6]


Bootstrap: Example

• n = 100 observations
• m bootstrap samples, each of size n
• For each bootstrap sample, compute the estimates σ̂A², σ̂B², σ̂AB and, from the formula above, the corresponding α̂
• The resulting bootstrap estimate is α̂ = 0.5964 (the simulated value was α = 0.6)
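A hedged sketch of this instrument example. The arrays a and b (past measurements from instruments A and B) and the function names are illustrative; the plug-in formula is the variance-minimizing α given above.

```python
import numpy as np

def alpha_hat(a, b):
    """Plug-in estimate of alpha from the sample variances and covariance of A and B."""
    cov = np.cov(a, b)                              # 2x2 sample covariance matrix
    return (cov[1, 1] - cov[0, 1]) / (cov[0, 0] + cov[1, 1] - 2 * cov[0, 1])

def bootstrap_alpha(a, b, m=1000, seed=0):
    """Bootstrap the estimate of alpha and report its mean and standard error."""
    rng = np.random.default_rng(seed)
    n, reps = len(a), []
    for _ in range(m):
        idx = rng.integers(0, n, size=n)            # resample observations with replacement
        reps.append(alpha_hat(a[idx], b[idx]))
    return float(np.mean(reps)), float(np.std(reps, ddof=1))
```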
Conclusion:
Choosing the Optimal Model?

[Figure: model-selection criteria compared: validation-set error, 10-fold CV error, and BIC plotted against model size]
Conclusion:
Choosing the Optimal Model?

[Figure: the same criteria (validation-set error, BIC, 10-fold CV) with the one-standard-error rule applied]

One-standard-error rule:
• Compute the standard error of the estimated test MSE for each model size
• Select the smallest model for which the estimated test error is within one standard error of the lowest point on the curve
Classification Models
• Data: (x1,y1), (x2,y2),…,(xn,yn)
• Binary class problems
• Multi-class problems
• Underlying true distribution P(X, y)
• How well is the underlying distribution learnt by a classifier?
• Questions
• How do we estimate the true performance of a
classifier?
• How good are the parameter estimates in the classifier?
Evaluation Metrics: Binary Classification
True (T):      +   +   +   +   -   -   -   -   -   -
Predicted (P): +   -   +   -   +   -   -   -   -   -
Outcome:       TP  FN  TP  FN  FP  TN  TN  TN  TN  TN

TP: True Positive (positive sample classified as the positive class)
FN: False Negative (positive sample classified as the negative class)
FP: False Positive (negative sample classified as the positive class)
TN: True Negative (negative sample classified as the negative class)

Evaluation Metrics: Binary Classification
Confusion Matrix (Contingency table)

                 Predicted +           Predicted -
True +           True Positive (TP)    False Negative (FN)
True -           False Positive (FP)   True Negative (TN)

Accuracy = (TP + TN) / (P + N)
Misclassification rate = 1 − Accuracy
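A small sketch computing the confusion-matrix counts and accuracy for a binary problem (labels are assumed to be encoded so that the value of positive marks the positive class; function names are illustrative):

```python
import numpy as np

def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for a binary classification problem."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = int(np.sum((y_true == positive) & (y_pred == positive)))
    fn = int(np.sum((y_true == positive) & (y_pred != positive)))
    fp = int(np.sum((y_true != positive) & (y_pred == positive)))
    tn = int(np.sum((y_true != positive) & (y_pred != positive)))
    return tp, fn, fp, tn

def accuracy(y_true, y_pred):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fn + fp + tn)          # (TP + TN) / (P + N)
```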
Evaluation Metrics: Binary Classification
          Classifier 1                         Classifier 2
          Pred +   Pred -                      Pred +   Pred -
True +      15        5                True +    18        2
True -      30      950                True -    20      960

Compute the accuracy:

Accuracy = (15 + 950) / (20 + 980) = 0.965        Accuracy = (18 + 960) / (20 + 980) = 0.978

The accuracies of the two classifiers are nearly identical.
For highly imbalanced data, accuracy is not a good measure.
Evaluation Metrics: Binary Classification
Confusion Matrix (Contingency table)

                 Predicted +           Predicted -
True +           True Positive (TP)    False Negative (FN)
True -           False Positive (FP)   True Negative (TN)

Precision = TP / (TP + FP)
          = (positive samples correctly classified as positive by the classifier) / (total samples predicted as positive by the classifier)

Precision is the fraction of the classifier's positive predictions that are truly positive.
Evaluation Metrics: Binary Classification
Confusion Matrix (Contingency table)

                 Predicted +           Predicted -
True +           True Positive (TP)    False Negative (FN)
True -           False Positive (FP)   True Negative (TN)

Recall = TP / (TP + FN)

Recall is the fraction of the truly positive samples that the classifier predicts as positive.
Evaluation Metrics: Binary Classification
          Classifier 1                         Classifier 2
          Pred +   Pred -                      Pred +   Pred -
True +      15        5                True +    18        2
True -      30      950                True -    10      970

Precision = 15 / (15 + 30) ≈ 0.33              Precision = 18 / (18 + 10) ≈ 0.64
Recall    = 15 / (15 + 5)  = 0.75              Recall    = 18 / (18 + 2)  = 0.90

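A sketch of precision and recall built on the illustrative confusion_counts helper above:

```python
def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP/(TP+FP), Recall = TP/(TP+FN); returns 0 when a denominator is empty."""
    tp, fn, fp, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```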
Evaluation Metrics: Binary Classification
Confusion Matrix (Contingency table)

                 Predicted +           Predicted -
True +           True Positive (TP)    False Negative (FN)
True -           False Positive (FP)   True Negative (TN)

Specificity = TN / (FP + TN)
Recall (sensitivity) = TP / (TP + FN)
Evaluation Metrics: Binary Classification
Specificity = TN / (FP + TN)        Recall (sensitivity) = TP / (TP + FN)

RT-PCR vs Cancer (Mammogram) Test
Evaluation Metrics: Binary Classification
Receiver Operating Characteristic (ROC) Curve

True Positive Rate (TPR) = TP / (TP + FN)
False Positive Rate (FPR) = FP / (FP + TN)

ROC curve: a plot of TPR (y-axis) against FPR (x-axis) as the classification threshold is varied
Evaluation Metrics: Binary Classification
Receiver Operating Characteristic (ROC) Curve

True Positive Rate (TPR) = TP / (TP + FN)
False Positive Rate (FPR) = FP / (FP + TN)

[Figure: an example ROC curve, TPR (y-axis) versus FPR (x-axis)]
Evaluation Metrics: Binary Classification
Receiver Operating Characteristic (ROC) Curve
True Positive Rate (TPR) = TP / (TP + FN)        False Positive Rate (FPR) = FP / (FP + TN)

Point   Probability   Threshold   TPR   FPR
  +        0.90
  +        0.85
  +        0.60
  -        0.49
  -        0.40
  -        0.35
  -        0.34
  +        0.32
  -        0.20
  -        0.10

(The TPR and FPR columns are filled in by sweeping the threshold down through the predicted probabilities.)
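A hedged sketch of computing the ROC points by sweeping the threshold down through the scores, using the labels and probabilities from the table above (roc_points is an illustrative name, not a library call):

```python
import numpy as np

def roc_points(y_true, scores):
    """Return the (FPR, TPR) points traced out as the classification threshold is lowered."""
    order = np.argsort(scores)[::-1]                # sort points by decreasing score
    y = np.asarray(y_true)[order]
    P, N = y.sum(), len(y) - y.sum()
    tpr, fpr, tp, fp = [0.0], [0.0], 0, 0
    for label in y:                                 # admit one more point as "positive" each step
        tp += label
        fp += 1 - label
        tpr.append(tp / P)
        fpr.append(fp / N)
    return np.array(fpr), np.array(tpr)

# The ten points from the table (1 = positive, 0 = negative)
scores = [0.9, 0.85, 0.6, 0.49, 0.4, 0.35, 0.34, 0.32, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
fpr, tpr = roc_points(labels, scores)
```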
Evaluation Metrics: Binary Classification
Receiver Operating Characteristic (ROC) Curve
[Figures: the resulting ROC curves; figures not reproduced]
Evaluation Metrics: Binary Classification
Comparing Receiver Operating Characteristic (ROC) Curves

True Positive Rate (TPR) = TP / (TP + FN)        False Positive Rate (FPR) = FP / (FP + TN)

[Figure: ROC curves of several classifiers plotted on the same TPR/FPR axes; curves closer to the top-left corner correspond to better classifiers]
Evaluation Metrics: Binary Classification
Receiver Operating Characteristic (ROC) Curves
[Figures: a sequence of further ROC-curve examples and comparisons; figures not reproduced]
Evaluation Metrics: Binary Classification
Precision Recall Curve

Evaluation Metrics: Multi-Class Classification Metrics
[Slide content for multi-class classification metrics not reproduced]
References:
1. Tom Fawcett, "An Introduction to ROC Analysis", Pattern Recognition Letters, 2006, pp. 861-874.
2. Alaa Tharwat, "Classification Assessment Methods", Applied Computing and Informatics, Vol. 17, No. 1, 2021, pp. 168-192.
