SPPUML5
Roll No : 2441059
Batch : D
Assignment No. 05: Implement the K-Nearest Neighbors algorithm on the diabetes.csv dataset. Compute the
confusion matrix, accuracy, error rate, precision and recall on the given dataset.
Dataset link: https://www.kaggle.com/datasets/abdallamahgoub/diabetes
In [2]: df = pd.read_csv('diabetes.csv')
In [3]: df.head()
Out[3]:
   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  Pedigree  Age  Outcome
1            1       85             66             29        0  26.6     0.351   31        0
3            1       89             66             23       94  28.1     0.167   21        0
In [4]: df.tail()
Out[4]:
Pregnancies Glucose BloodPressure SkinThickness Insulin BMI Pedigree Age Outcome
In [5]: df.isnull().sum()
Out[5]: Pregnancies 0
Glucose 0
BloodPressure 0
SkinThickness 0
Insulin 0
BMI 0
Pedigree 0
Age 0
Outcome 0
dtype: int64
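The null check above only counts NaN entries. In this dataset a 0 in columns such as Glucose, BloodPressure, SkinThickness, Insulin or BMI is commonly treated as a missing measurement (the first row shown by df.head() has Insulin 0). Below is a minimal sketch for counting such zeros. The helper name count_zeros and the column list ZERO_AS_MISSING are assumptions for illustration, and the stand-in frame contains only the two rows visible in the df.head() output.

```python
import pandas as pd

# Columns where a zero is physiologically implausible (assumed list).
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def count_zeros(frame, columns=ZERO_AS_MISSING):
    """Return the number of zero entries per column (hypothetical helper)."""
    return (frame[columns] == 0).sum()


# Stand-in data: the two rows visible in the df.head() output above.
sample = pd.DataFrame(
    {
        "Pregnancies": [1, 1],
        "Glucose": [85, 89],
        "BloodPressure": [66, 66],
        "SkinThickness": [29, 23],
        "Insulin": [0, 94],
        "BMI": [26.6, 28.1],
        "Pedigree": [0.351, 0.167],
        "Age": [31, 21],
        "Outcome": [0, 0],
    },
    index=[1, 3],
)

zero_counts = count_zeros(sample)
print(zero_counts)
```

On the full data, count_zeros(df) would report how many rows per column might need imputation before a distance-based model such as KNN is fitted.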
localhost:8888/notebooks/ML/ML-5.ipynb 1/4
14/08/2024, 15:56 ML-5 - Jupyter Notebook
In [7]: X
Out[7]:
   Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  Pedigree  Age
1            1       85             66             29        0  26.6     0.351   31
3            1       89             66             23       94  28.1     0.167   21
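The cells that define X, y and the train/test split are not shown. Judging from the output above, X holds every column except Outcome. The sketch below shows one way such a split might be written. The helper split_features_target, the 80/20 ratio and the seed are assumptions; a 20% test split of the usual 768 rows would give 154 test rows, which is consistent with the total of the confusion matrix printed further down. The demo frame is synthetic.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def split_features_target(frame, target="Outcome", test_size=0.2, seed=42):
    """Separate features from the target and split them (hypothetical helper)."""
    X = frame.drop(columns=[target])  # every column except the label
    y = frame[target]                 # the label column
    return train_test_split(X, y, test_size=test_size, random_state=seed)


# Synthetic stand-in with 768 rows; the values are made up.
demo = pd.DataFrame(
    {
        "Glucose": range(768),
        "Age": range(768),
        "Outcome": [i % 2 for i in range(768)],
    }
)

X_train, X_test, y_train, y_test = split_features_target(demo)
print(X_train.shape, X_test.shape)
```

Fixing random_state keeps the split reproducible, so metrics can be compared across runs.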
In [10]: X_train
In [11]: k = 3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)
Out[11]: KNeighborsClassifier(n_neighbors=3)
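To illustrate what the classifier does at prediction time, here is a minimal from-scratch sketch of k-nearest-neighbors voting with Euclidean distance and uniform weights, which are the defaults of KNeighborsClassifier. The function knn_predict and the toy (Glucose, BMI) points are made up for illustration; this is not scikit-learn's implementation, and tie handling may differ.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    # Pair each training label with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k nearest neighbours and take the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy training data (made up): (Glucose, BMI) pairs with binary outcomes.
train_points = [(85, 26.6), (89, 28.1), (90, 27.0), (160, 35.0), (170, 38.2), (155, 33.1)]
train_labels = [0, 0, 0, 1, 1, 1]

low_query = knn_predict(train_points, train_labels, (88, 27.5), k=3)
high_query = knn_predict(train_points, train_labels, (165, 36.0), k=3)
print(low_query, high_query)
```

Because the vote depends on raw distances, features with large numeric ranges (such as Glucose or Insulin) dominate unless the data is scaled first.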
/home/comp/anaconda3/lib/python3.9/site-packages/sklearn/neighbors/_classification.py:228: FutureWarning: Unlike other reduction functions (e.g. `skew`, `kurtosis`), the default behavior of `mode` typically preserves the axis it acts along. In SciPy 1.11.0, this behavior will change: the default value of `keepdims` will become False, the `axis` over which the statistic is taken will be eliminated, and the value None will no longer be accepted. Set `keepdims` to True or False to avoid this warning.
  mode, _ = stats.mode(_y[neigh_ind, k], axis=1)
Out[12]: array([0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0,
0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0,
0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,
0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1,
0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0])
print("Confusion Matrix:")
print(conf_matrix)
print("Accuracy:", accuracy)
print("Error Rate:", error_rate)
print("Precision:", precision)
print("Recall:", recall)
Confusion Matrix:
[[81 18]
[27 28]]
Accuracy: 0.7077922077922078
Error Rate: 0.29220779220779225
Precision: 0.6086956521739131
Recall: 0.509090909090909
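The reported metrics can be re-derived by hand from the confusion matrix. The sketch below assumes scikit-learn's layout, where rows are true classes and columns are predicted classes, so the matrix reads [[TN, FP], [FN, TP]]. The variable names are chosen here for illustration.

```python
# Entries of the confusion matrix printed above, assuming [[TN, FP], [FN, TP]].
tn, fp = 81, 18
fn, tp = 27, 28

total = tn + fp + fn + tp          # number of test samples

accuracy = (tp + tn) / total       # share of correct predictions
error_rate = (fp + fn) / total     # share of wrong predictions, equals 1 - accuracy
precision = tp / (tp + fp)         # of predicted positives, how many are truly positive
recall = tp / (tp + fn)            # of actual positives, how many were found

print("Total:", total)
print("Accuracy:", accuracy)
print("Error Rate:", error_rate)
print("Precision:", precision)
print("Recall:", recall)
```

These values agree with the output above: 109 of 154 predictions are correct, 28 of 46 predicted positives are true positives, and 28 of 55 actual positives are recovered.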
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
In [16]: sc = metrics.accuracy_score(y_test,y_pred)
print("SVM accuracy: ", sc)
In [19]: lr = LogisticRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)
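The last cells fit further models on the same split, but the definition of model is not shown and no score is printed for the logistic regression. The sketch below compares the three model types on synthetic data so it runs on its own. The choice of SVC() for the SVM, max_iter=1000, and the make_classification data are assumptions and do not reproduce the notebook's results.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the diabetes data: 768 rows, 8 features.
X_demo, y_demo = make_classification(n_samples=768, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Candidate models; the hyperparameters are illustrative assumptions.
models = {
    "KNN (k=3)": KNeighborsClassifier(n_neighbors=3),
    "SVM": SVC(),
    "Logistic regression": LogisticRegression(max_iter=1000),
}

scores = {}
for name, clf in models.items():
    clf.fit(X_tr, y_tr)                                   # train on the training split
    scores[name] = accuracy_score(y_te, clf.predict(X_te))  # score on held-out data
    print(f"{name} accuracy: {scores[name]:.3f}")
```

With the real, unscaled data, LogisticRegression() at its default iteration limit may emit a convergence warning; raising max_iter or scaling the features are two common remedies.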