Cross Validation
Machine Learning
by Tom M. Mitchell
Cross Validation
• The idea in machine learning is to not use the entire data set when training
a learner; some of the data is held back and later used to test the trained model.
Cross Validation cont…
• A method of estimating the expected prediction error of a model.
Cross Validation cont…
Types
1) Holdout method
2) K-Fold CV
3) Leave one out CV
4) Bootstrap methods
Holdout method
• The holdout cross validation method is the simplest of all. In this
method, you randomly assign data points to two sets, a training set and
a test set. Their sizes are arbitrary, though the test set is typically the
smaller of the two.
K-fold
• K-fold cross validation is one way to improve over the holdout
method. The data set is divided into k subsets, and the holdout
method is repeated k times
• Each time, one of the k subsets is used as the test set and the
other k-1 subsets are put together to form a training set
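To make the fold mechanics concrete, here is a minimal sketch (not on the original slides) showing how sklearn's KFold assigns every point to exactly one test fold:

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(10)  # ten toy data points
kf = KFold(n_splits=5, shuffle=True, random_state=1)
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    # each point lands in exactly one test fold across the k iterations
    print("fold %d: train=%s test=%s" % (fold, train_idx, test_idx))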
Leave one out CV
• Leave-one-out cross validation is K-fold cross validation taken to its
logical extreme, with K equal to N, the number of data points in
the set
• That means that N separate times, the function approximator is
trained on all the data except for one point, and a prediction is
made for that point.
Leave one out cont…
• A special case of K-fold cross validation, with K equal to the number of data points.
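A quick sketch (not on the slides) confirming that equivalence with sklearn's splitters:

import numpy as np
from sklearn.model_selection import KFold, LeaveOneOut

X = np.arange(5)
# unshuffled KFold with n_splits = N produces exactly the leave-one-out splits
for (tr1, te1), (tr2, te2) in zip(LeaveOneOut().split(X), KFold(n_splits=len(X)).split(X)):
    assert np.array_equal(tr1, tr2) and np.array_equal(te1, te2)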
Bootstrap
• Randomly draw datasets, with replacement, from the training sample
• Each sample is the same size as the training sample
• Refit the model on each bootstrap sample
• Examine the fitted models across the bootstrap samples
Bootstrap cont…
• Example of bootstrap
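The worked example itself is not reproduced on the slide; below is a minimal bootstrap sketch, assuming a feature/target pair x1, y1 like the diabetes arrays used later in the deck:

import numpy as np
from sklearn.utils import resample
from sklearn.linear_model import LogisticRegression

accs = []
for i in range(100):
    # draw a bootstrap sample, with replacement, the same size as the training sample
    xb, yb = resample(x1, y1, replace=True, n_samples=len(y1), random_state=i)
    model = LogisticRegression(max_iter=300).fit(xb, yb)  # refit on the bootstrap sample
    accs.append(model.score(x1, y1))  # examine each refitted model
print("bootstrap accuracy: %.3f (%.3f)" % (np.mean(accs), np.std(accs)))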
Logistic Regression with Cross Validation
Logistic Regression
# evaluate a logistic regression model using k-fold cross-validation
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
Logistic Regression
# create dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15, n_redundant=5, random_state=1)
# prepare the cross-validation procedure
cv1 = KFold(n_splits=10, random_state=12, shuffle=True)
# create model
model = LogisticRegression(max_iter=300)
Logistic Regression
# evaluate model
scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv1, n_jobs=-1)
# report performance
print('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))
Repeated k-Fold Cross-Validation
• The estimate of model performance via k-fold cross-validation can be
noisy.
• This means that each time the procedure is run, a different split of the
dataset into k-folds can be implemented, and in turn, the distribution
of performance scores can be different, resulting in a different mean
estimate of model performance.
Repeated k-Fold Cross-Validation
# evaluate a logistic regression model using repeated k-fold cross-validation
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedKFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
Repeated k-Fold Cross-Validation
# create dataset
X, y = make_classification(n_samples=1000, n_features=20,
n_informative=15, n_redundant=5, random_state=1)
# prepare the cross-validation procedure
cv = RepeatedKFold(n_splits=10, n_repeats=3, random_state=1)
# create model
model = LogisticRegression()
Repeated k-Fold Cross-Validation
# evaluate model
scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report performance
print('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))
Repeated k-Fold Cross-Validation
• Running the example creates the dataset, then evaluates a logistic
regression model on it using 10-fold cross-validation with three
repeats. The mean classification accuracy on the dataset is then
reported.
Stratified k-Fold Cross-Validation
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
Stratified k-Fold Cross-Validation
# Import necessary modules
#from sklearn.metrics import mean_squared_error
#from math import sqrt
# from sklearn import model_selection
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
Stratified k-Fold Cross-Validation
df = pd.read_csv('./dataset/diabetes.csv')
x1 = df.drop('Outcome', axis=1)
y1 = df['Outcome']
Stratified k-Fold Cross-Validation
skfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=100)  # shuffle must be True when random_state is set
model_skfold = LogisticRegression()
results_skfold = cross_val_score(model_skfold, x1, y1, scoring='accuracy', cv=skfold)
print("Accuracy: %.2f%%" % (results_skfold.mean()*100.0))
Stratified k-Fold Cross-Validation Classification Report
>> from sklearn.metrics import classification_report, accuracy_score, make_scorer
>> model_skfold = LogisticRegression()
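The next slide calls classification_report_with_accuracy_score, which is never defined on these slides; a minimal sketch of such a scorer would print the report for each split and return accuracy as the score:

def classification_report_with_accuracy_score(y_true, y_pred):
    print(classification_report(y_true, y_pred))  # report for this split
    return accuracy_score(y_true, y_pred)  # scalar consumed by cross_val_score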
Stratified k-Fold Cross-Validation Classification Report
>> # Nested CV with parameter optimization
>> nested_score = cross_val_score(model_skfold, X=x1, y=y1, cv=skfold, scoring=make_scorer(classification_report_with_accuracy_score))
>> print(nested_score)
Multi-class classification
>> from sklearn import datasets
>> from sklearn.metrics import confusion_matrix
>> from sklearn.model_selection import train_test_split
>> # loading the iris dataset
>> iris = datasets.load_iris()
Multi-class classification
>> # dividing X, y into train and test data
>> X, y = iris.data, iris.target
>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
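The slide that trains the classifier is missing from the deck; a minimal sketch, assuming the linear-kernel SVC that the names svm_model_linear and svm_predictions on the next slide suggest:

>> from sklearn.svm import SVC
>> # train the classifier and predict on the held-out test set
>> svm_model_linear = SVC(kernel='linear', C=1).fit(X_train, y_train)
>> svm_predictions = svm_model_linear.predict(X_test)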
Multi-class classification
>> # model accuracy for X_test
>> accuracy = svm_model_linear.score(X_test, y_test)
>> print(classification_report(y_test, svm_predictions))
Multi-class classification – cross validation
# Nested CV with parameter optimization
>> nested_score = cross_val_score(model_skfold, X=X, y=y, cv=skfold, scoring=make_scorer(classification_report_with_accuracy_score))
>> print(nested_score)
All methods of Cross Validation
All methods of Cross Validation - Introduction
• Building machine learning models is an important element of
predictive modelling. However, without proper model validation, the
confidence that the trained model will generalize well on the unseen
data can never be high.
All methods of Cross Validation - Introduction
• Hold Out Validation
• K-fold Cross-Validation
• Stratified K-fold Cross-Validation
• Leave One Out Cross-Validation
• Repeated Random Test-Train Splits
All methods of Cross Validation – Diabetes data set details
• pregnancies - Number of times pregnant.
• glucose - Plasma glucose concentration.
• diastolic - Diastolic blood pressure (mm Hg).
• triceps - Skinfold thickness (mm).
• insulin - 2-Hour serum insulin (mu U/ml).
• bmi - BMI (weight in kg/(height in m)²).
• dpf - Diabetes pedigree function.
• age - Age in years.
• diabetes - “1” represents the presence of diabetes while “0”
represents the absence of it. This is the target variable.
All methods of Cross Validation - Introduction
Steps
• In this guide, we will follow these steps:
• Step 1 - Loading the required libraries and modules.
• Step 2 - Reading the data and performing basic data checks.
• Step 3 - Creating arrays for the features and the response variable.
• Step 4 - Trying out different model validation techniques.
All methods of Cross Validation - Introduction
Step 1 - Loading the Required Libraries and Modules
All methods of Cross Validation - Introduction
# Import necessary modules
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from math import sqrt
from sklearn import model_selection  # used below as model_selection.train_test_split, model_selection.KFold, etc.
from sklearn.linear_model import LogisticRegression
All methods of Cross Validation - Introduction
from sklearn.model_selection import KFold
from sklearn.model_selection import LeaveOneOut
from sklearn.model_selection import LeavePOut
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import StratifiedKFold
All methods of Cross Validation - Introduction
Step 2 - Reading the Data and Performing Basic Data Checks
• The first line of code below reads in the data as a pandas dataframe,
while the second line prints the shape - 768 observations of 9
variables. The third line gives the transposed summary statistics of the
variables.
All methods of Cross Validation - Introduction
dat = pd.read_csv('diabetes.csv')
print(dat.shape)
dat.describe().transpose()
Output: (768, 9), followed by the transposed summary statistics of the variables.
All methods of Cross Validation - Introduction
• Looking at the summary for the 'diabetes' variable, we observe that
the mean value is 0.35, which means that around 35 percent of the
observations in the dataset have diabetes. Therefore, the baseline
accuracy is 65 percent, and the model we build should definitely beat
this baseline benchmark.
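A one-line sanity check of that baseline, assuming the dat frame from Step 2 (the mean of a 0/1 column is the fraction of positives):

print("Baseline accuracy: %.2f%%" % ((1 - dat['diabetes'].mean()) * 100.0))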
All methods of Cross Validation - Introduction
• Step 3 - Creating Arrays for the Features and the Response Variable
• The two lines of code below create arrays for the features and the
response variable, respectively.
x1 = dat.drop('diabetes', axis=1).values
y1 = dat['diabetes'].values
All methods of Cross Validation - Introduction
• Step 4 - Trying out Different Model Validation Techniques
• With the arrays of the features and the response variable created, we
will start discussing the various model validation strategies.
Holdout Validation Approach - Train and Test Set Split
• The holdout validation approach refers to creating the training and the
holdout sets, also referred to as the 'test' or the 'validation' set.
• The training data is used to train the model while the unseen data is
used to validate the model performance.
• The common split ratio is 70:30, while for small datasets, the ratio can
be 90:10.
Holdout Validation Approach - Train and Test Set Split
# Evaluate using a train and a test set
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(x1, y1,
test_size=0.30, random_state=100)
model = LogisticRegression()
model.fit(X_train, Y_train)
result = model.score(X_test, Y_test)
print("Accuracy: %.2f%%" % (result*100.0))
Holdout Validation Approach - Train and Test Set Split
Output:
Accuracy: 74.46%
• We can see that the accuracy of the model on the test data is
approximately 74 percent. The above technique is useful, but it has
pitfalls.
• The split is very important and, if it goes wrong, it can lead to the
model overfitting or underfitting the new data. This problem can be
rectified with resampling methods.
K-fold Cross-Validation
• In k-fold cross-validation, the data is divided into k folds. The model
is trained on k-1 folds with one fold held back for testing.
• This process is repeated so that each fold of the dataset gets
a chance to be the held-back test set.
K-fold Cross-Validation
kfold = model_selection.KFold(n_splits=8, shuffle=True, random_state=100)  # shuffle must be True when random_state is set
model_kfold = LogisticRegression()  # create the model object
results_kfold = model_selection.cross_val_score(model_kfold, x1, y1, cv=kfold)
print("Accuracy: %.2f%%" % (results_kfold.mean()*100.0))
K-fold Cross-Validation
Output:
Accuracy: 76.95%
Stratified K-fold Cross-Validation
• Stratified K-Fold approach is a variation of k-fold cross-validation
that returns stratified folds, i.e., each set containing approximately
the same ratio of target labels as the complete data.
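The code for this step is not shown in this part of the deck; a minimal sketch mirroring the earlier StratifiedKFold example, using the x1/y1 arrays from Step 3:

skfold = model_selection.StratifiedKFold(n_splits=3, shuffle=True, random_state=100)
model_skfold = LogisticRegression()
results_skfold = model_selection.cross_val_score(model_skfold, x1, y1, cv=skfold)
print("Accuracy: %.2f%%" % (results_skfold.mean()*100.0))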
Stratified K-fold Cross-Validation
• The mean accuracy for the model using stratified k-fold cross-
validation is 76.96 percent.
Leave One Out Cross-Validation (LOOCV)
• LOOCV is the cross-validation technique in which the size of the fold
is “1” with “k” being set to the number of observations in the data.
This variation is useful when the training data is of limited size and
the number of parameters to be tested is not high.
loocv = model_selection.LeaveOneOut()
model_loocv = LogisticRegression()
results_loocv = model_selection.cross_val_score(model_loocv, x1, y1,
cv=loocv)
print("Accuracy: %.2f%%" % (results_loocv.mean()*100.0))
Leave One Out Cross-Validation (LOOCV)
Output
Accuracy: 76.82%
• The mean accuracy for the model using the leave-one-out cross-
validation is 76.82 percent.
Repeated Random Test-Train Splits
• This technique is a hybrid of traditional train-test splitting and the k-
fold cross-validation method.
• In this technique, we create random train-test splits of the data
and then repeat the process of splitting and evaluating the
algorithm multiple times, just like the cross-validation method.
Repeated Random Test-Train Splits
>> kfold2 = model_selection.ShuffleSplit(n_splits=10, test_size=0.30, random_state=100)
>> model_shufflecv = LogisticRegression()
>> results_4 = model_selection.cross_val_score(model_shufflecv, x1, y1, cv=kfold2)
>> print("Accuracy: %.2f%% (%.2f%%)" % (results_4.mean()*100.0, results_4.std()*100.0))
Repeated Random Test-Train Splits
Output
Accuracy: 74.76% (2.52%)
• The mean accuracy for the model using the repeated random train-
test split method is 74.76 percent.
Cross validation for classification reports, confusion
matrices, individual split reports, etc.
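A minimal sketch of that idea, assuming the x1, y1 arrays and the skfold splitter from the earlier slides: cross_val_predict collects the out-of-fold predictions, from which a single confusion matrix and classification report can be built.

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import classification_report, confusion_matrix

# out-of-fold predictions: each row is predicted by the model that did not train on it
y_pred = cross_val_predict(LogisticRegression(max_iter=300), x1, y1, cv=skfold)
print(confusion_matrix(y1, y_pred))
print(classification_report(y1, y_pred))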