Data Mining and Warehousing Concepts Lab: (ITPC - 228)


Data mining and warehousing concepts Lab

(ITPC - 228)
(Assignment - 6)

AIM(6.a) - Python Implementation of KNeighborsClassifier Algorithm using iris dataset.

STEPS -

1. Load the Iris dataset using scikit-learn's load_iris function. This
dataset contains information about iris flowers, including sepal and petal
measurements.
2. Split the dataset into features (X) and target labels (y). The features
are the measurements, and the target labels are the species of iris flowers.
3. Split the dataset into training and test sets using train_test_split from
sklearn.model_selection. This step is crucial for evaluating the performance
of the KNN classifier.
4. Create a KNN classifier using KNeighborsClassifier from
sklearn.neighbors. Specify the number of neighbors (k) to consider in the
classification.
5. Train the KNN classifier on the training set using the fit method. This
step stores the training data so that it can be used for making predictions later.
6. Make predictions on the test set using the predict method of the KNN
classifier. This step calculates the nearest neighbors of each test instance
and assigns the most common class label among those neighbors as the
predicted label.
7. Evaluate the classifier by comparing the predicted labels with the
actual labels in the test set. Calculate the accuracy of the classifier using
accuracy_score from sklearn.metrics.
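
To illustrate what step 6 does internally, here is a minimal NumPy sketch of majority-vote prediction for a single query point. It is not scikit-learn's implementation: the function name knn_predict and the toy data are hypothetical, Euclidean distance is assumed, and ties are assumed to go to the smallest label.

```python
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    """Predict one query point's label by majority vote of its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    # Euclidean distance from the query to every training point
    distances = np.linalg.norm(X_train - np.asarray(x_query, dtype=float), axis=1)
    # Indices of the k training points with the smallest distances
    nearest = np.argsort(distances)[:k]
    # Count the labels among those neighbors; argmax picks the most common one
    # (assumption: labels are non-negative integers, ties go to the smallest label)
    votes = np.bincount(y_train[nearest])
    return int(np.argmax(votes))

# Hypothetical toy data: two well-separated clusters labeled 0 and 1
toy_X = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
toy_y = [0, 0, 0, 1, 1, 1]

toy_pred_low = knn_predict(toy_X, toy_y, [1.1, 1.0], k=3)
toy_pred_high = knn_predict(toy_X, toy_y, [5.1, 5.0], k=3)
print(toy_pred_low, toy_pred_high)
```

The sketch also shows why fitting a KNN model amounts to storing the training data: all of the distance computation happens at prediction time.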
Source code :

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset
dataset = load_iris()
X = dataset.data
y = dataset.target

# Split the dataset into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Create a KNN classifier
k = 3
knn = KNeighborsClassifier(n_neighbors=k)

# Train the classifier on the training set
knn.fit(X_train, y_train)

# Make predictions on the test set
y_pred = knn.predict(X_test)

# Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

# Sample prediction (sepal length, sepal width, petal length, petal width in cm)
sample_data = [[5.1, 3.0, 1.4, 0.2]]
sample_pred = knn.predict(sample_data)
print(f"Predicted class for sample data {sample_data}: "
      f"{dataset.target_names[sample_pred[0]]}")

Output :
AIM(6.b) - Python Implementation of Naive Bayes Classifier
Algorithm using iris dataset.

STEPS -

1. Load the Iris dataset: Use load_iris() from sklearn.datasets to load
the Iris dataset.
2. Split the dataset: Use train_test_split() from
sklearn.model_selection to split the dataset into training and test
sets.
3. Create the classifier: Use GaussianNB() from sklearn.naive_bayes
to create a Gaussian Naive Bayes classifier.
4. Train the classifier: Use the fit() method to train the classifier on the
training set (X_train, y_train).
5. Make predictions: Use the predict() method to make predictions on
the test set (X_test).
6. Calculate accuracy: Use accuracy_score() from sklearn.metrics to
calculate the accuracy of the classifier.
7. Sample prediction: Create a sample data point and use the
classifier to predict its class.
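
As a rough illustration of what steps 3 to 5 compute, the sketch below fits and applies a Gaussian Naive Bayes model with plain NumPy. It is not scikit-learn's code: the function names, the toy data, and the fixed 1e-9 variance floor are assumptions made for this example.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-class, per-feature means and variances."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Assumption: a small constant keeps variances strictly positive
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    return classes, priors, means, variances

def predict_gaussian_nb(model, X):
    """Return the class with the highest log prior plus Gaussian log-likelihood."""
    classes, priors, means, variances = model
    X = np.asarray(X, dtype=float)
    # Shape (n_samples, n_classes): features are treated as independent,
    # so their log-densities are summed over the feature axis
    log_likelihood = -0.5 * (
        np.log(2.0 * np.pi * variances)[None, :, :]
        + (X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]
    ).sum(axis=2)
    joint = np.log(priors)[None, :] + log_likelihood
    return classes[np.argmax(joint, axis=1)]

# Hypothetical toy data: two well-separated clusters labeled 0 and 1
toy_X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [4.0, 5.0], [4.2, 4.8], [3.8, 5.2]]
toy_y = [0, 0, 0, 1, 1, 1]

toy_model = fit_gaussian_nb(toy_X, toy_y)
toy_pred = predict_gaussian_nb(toy_model, [[1.1, 2.0], [4.1, 5.0]])
print(toy_pred)
```

Working in log space avoids numerical underflow when many small per-feature probabilities are multiplied together.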

Source code :

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Create a Gaussian Naive Bayes classifier
nb = GaussianNB()

# Train the classifier on the training set
nb.fit(X_train, y_train)

# Make predictions on the test set
y_pred = nb.predict(X_test)

# Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

# Sample prediction (sepal length, sepal width, petal length, petal width in cm)
sample_data = [[4.0, 3.5, 1.4, 0.2]]
sample_pred = nb.predict(sample_data)
print(f"Predicted class for sample data {sample_data}: "
      f"{iris.target_names[sample_pred[0]]}")

Output :
Data mining and warehousing concepts Lab
(ITPC - 228)
(Assignment - 7)

AIM - Python Implementation of Decision Tree using iris dataset.

Source code :
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Create a Decision Tree classifier
dt = DecisionTreeClassifier()

# Train the classifier on the training set
dt.fit(X_train, y_train)

# Make predictions on the test set
y_pred = dt.predict(X_test)

# Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

# Visualize the decision tree
plt.figure(figsize=(12, 8))
plot_tree(dt, filled=True, feature_names=iris.feature_names,
          class_names=iris.target_names)
plt.show()
Output :
