ML Record

The document outlines the structure and content of the Machine Learning Laboratory course at Hindusthan College of Engineering and Technology for the academic year 2024-2025. It includes various experiments related to Python libraries, algorithms like Find-S, SVM, Decision Trees, clustering methods, and k-Nearest Neighbors. Each experiment contains sections for aims, algorithms, and sample programs demonstrating the implementation of machine learning concepts.

HINDUSTHAN COLLEGE OF ENGINEERING AND TECHNOLOGY
(AUTONOMOUS INSTITUTION)
Coimbatore - 641032

22CS5252/MACHINE LEARNING LABORATORY

REG.NO :

NAME :

COURSE :

YEAR/SEM:
HINDUSTHAN COLLEGE OF ENGINEERING AND TECHNOLOGY
(AUTONOMOUS INSTITUTION)
Coimbatore - 641032
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Certified that this is the bonafide record of work done by

in the 22CS5252 / MACHINE LEARNING LABORATORY of this autonomous
institution, for the FIFTH Semester during the Academic Year 2024–2025.

Place : Coimbatore
Date:

Staff In-charge Head of the Department

Register Number:
Submitted for the 22CS5252 / MACHINE LEARNING LABORATORY practical examination
conducted on .

INTERNAL EXAMINER                                EXTERNAL EXAMINER
CONTENTS

S.NO   DATE   EXPERIMENT                                                    PAGE NO   MARKS   SIGN

1 a)          Implementation of Basic Python Libraries (Math, Numpy, Scipy)
1 b)          Implementation of Python Libraries for Machine Learning Applications (Pandas, Matplotlib)
1 c)          Creation and Loading of Datasets
2             Find-S Algorithm for Hypothesis Selection
3             Support Vector Machine (SVM) Decision Boundary
4             Decision Tree Classification using ID3 Algorithm
5             Clustering Using EM (GMM) and k-Means Algorithms
6             k-Nearest Neighbor Classification

STAFF IN-CHARGE
Ex.No: 01 a)
Implementation of Basic Python Libraries (Math, Numpy, Scipy)
Date:

Aim:

Algorithm:

Program:

# Importing libraries
import math
import numpy as np
from scipy import integrate
from scipy import linalg

# 1. Using the Math library
print("Math Library Operations:")
print("Square root of 16:", math.sqrt(16))
print("Factorial of 5:", math.factorial(5))
print("Sin(45 degrees):", math.sin(math.radians(45)))

# 2. Using the Numpy library
print("\nNumpy Library Operations:")

# Creating a numpy array
array = np.array([1, 2, 3, 4, 5])
print("Original array:", array)

# Array operations
print("Array after adding 10:", array + 10)
print("Mean of array:", np.mean(array))
print("Standard deviation of array:", np.std(array))
print("Dot product of array with itself:", np.dot(array, array))

# Matrix creation and operations
matrix = np.array([[1, 2], [3, 4]])
print("\nOriginal matrix:\n", matrix)
print("Matrix transpose:\n", np.transpose(matrix))
print("Matrix determinant:", np.linalg.det(matrix))

# 3. Using the Scipy library
print("\nScipy Library Operations:")

# Integration using Scipy
result, error = integrate.quad(lambda x: x**2, 0, 1)
print("Integration of x^2 from 0 to 1:", result)

# Solving a linear system Ax = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = linalg.solve(A, b)
print("Solution of linear system Ax = b:", x)
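As a quick optional check (an addition, not part of the prescribed program), the linear-system solution can be verified against the original system:

# Verify the computed solution satisfies Ax = b (within floating-point tolerance)
print("Solution check (A @ x close to b):", np.allclose(A @ x, b))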

Result:
Ex.No: 01 b)
Implementation of Python Libraries for Machine Learning Applications (Pandas, Matplotlib)
Date:

Aim:

Algorithm:

Program:
# Importing libraries
import pandas as pd
import matplotlib.pyplot as plt

# 1. Using the Pandas Library
print("Pandas Library Operations:")

# Loading a dataset (using a simple dictionary for illustration)
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'Salary': [50000, 54000, 45000, 62000, 58000]
}

# Creating a DataFrame
df = pd.DataFrame(data)
print("Dataframe:\n", df)

# Data inspection
print("\nBasic Data Information:")
print("Data types:\n", df.dtypes)
print("Summary statistics:\n", df.describe())

# Filtering data
filtered_df = df[df['Age'] > 25]
print("\nFiltered data (Age > 25):\n", filtered_df)

# 2. Using the Matplotlib Library
print("\nMatplotlib Library Operations:")

# Plotting a bar chart for Name vs. Salary
plt.figure(figsize=(8, 5))
plt.bar(df['Name'], df['Salary'], color='skyblue')
plt.xlabel('Name')
plt.ylabel('Salary')
plt.title('Salary by Person')
plt.show()

# Plotting Age vs Salary as a scatter plot
plt.figure(figsize=(8, 5))
plt.scatter(df['Age'], df['Salary'], color='green')
plt.xlabel('Age')
plt.ylabel('Salary')
plt.title('Salary vs Age')
plt.show()
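Pandas also supports quick aggregation on top of filtering; a small optional sketch (using the df defined above) that averages the salary of the filtered group:

# Mean salary of employees older than 25 (uses the df from the program above)
print("Mean salary (Age > 25):", df[df['Age'] > 25]['Salary'].mean())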

Result:
Ex.No: 01 c)
Creation and Loading of Datasets
Date:

Aim:

Algorithm:

Program:

# Importing libraries
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification, load_iris
import seaborn as sns

# 1. Creating a Synthetic Dataset Using Numpy
print("Synthetic Dataset with Numpy:")

# Generating random data with Numpy
np.random.seed(0)  # For reproducibility
synthetic_data = np.random.rand(10, 3)  # 10 rows, 3 columns
print(synthetic_data)

# 2. Creating a Synthetic Classification Dataset Using Scikit-Learn
print("\nSynthetic Classification Dataset with Scikit-Learn:")

# Generating a classification dataset with Scikit-Learn
X, y = make_classification(n_samples=100, n_features=4, n_classes=2, random_state=0)
print("Features:\n", X[:5])
print("Labels:\n", y[:5])

# 3. Loading a Dataset from a CSV File Using Pandas
print("\nLoading Dataset from CSV:")

# Sample data in dictionary form for example purposes
sample_data = {
    'A': np.random.rand(5),
    'B': np.random.rand(5),
    'C': np.random.rand(5)
}

# Creating a DataFrame and saving it to a CSV
df = pd.DataFrame(sample_data)
df.to_csv('sample_data.csv', index=False)

# Reading the CSV back into a DataFrame
df_loaded = pd.read_csv('sample_data.csv')
print(df_loaded)

# 4. Loading a Built-in Dataset Using Scikit-Learn
print("\nLoading Built-in Dataset with Scikit-Learn:")

# Load the Iris dataset
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
print(iris_df.head())

# 5. Loading a Built-in Dataset Using Seaborn
print("\nLoading Built-in Dataset with Seaborn:")

# Load the 'tips' dataset from Seaborn
tips = sns.load_dataset('tips')
print(tips.head())
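These datasets feed directly into the later experiments; a minimal optional sketch (assuming scikit-learn's train_test_split) that prepares the Iris data loaded above for modelling:

from sklearn.model_selection import train_test_split

# Split the Iris features and labels into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0)
print("Training samples:", X_train.shape[0], "| Test samples:", X_test.shape[0])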

Result:
Ex.No: 02
Find-S Algorithm for Hypothesis Selection
Date:

Aim:

Algorithm:

Program:
Let's start by creating a sample CSV file with training data. The CSV file should contain rows with attribute
values and the class label (e.g., "Yes" or "No").

Sample CSV File (training_data.csv)

Outlook  Temperature  Humidity  Wind    Water  Forecast  Play
Sunny    Warm         Normal    Strong  Warm   Same      Yes
Sunny    Warm         High      Strong  Warm   Same      Yes
Rainy    Cold         High      Strong  Warm   Change    No
Sunny    Warm         High      Strong  Cool   Change    Yes

Save this data as training_data.csv.
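The file can also be generated programmatically; a minimal sketch (an optional addition, not part of the prescribed record) that writes the table above, header included:

import csv

rows = [
    ['Outlook', 'Temperature', 'Humidity', 'Wind', 'Water', 'Forecast', 'Play'],
    ['Sunny', 'Warm', 'Normal', 'Strong', 'Warm', 'Same', 'Yes'],
    ['Sunny', 'Warm', 'High', 'Strong', 'Warm', 'Same', 'Yes'],
    ['Rainy', 'Cold', 'High', 'Strong', 'Warm', 'Change', 'No'],
    ['Sunny', 'Warm', 'High', 'Strong', 'Cool', 'Change', 'Yes'],
]
# Write the training data, including the header row
with open('training_data.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)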

import csv

# Step 1: Read the CSV file, skipping the header row
def read_csv(file_path):
    data = []
    with open(file_path, mode='r') as file:
        reader = csv.reader(file)
        next(reader)  # Skip the header row (attribute names)
        for row in reader:
            data.append(row)
    return data

# Step 2: Find-S Algorithm Implementation
def find_s_algorithm(data):
    # Step 2.1: Initialize the hypothesis
    # Assuming the first instance is positive, it is the most specific hypothesis
    hypothesis = data[0][:-1]  # All attributes except the last (target class)
    # Step 2.2: Iterate through the training data
    for row in data:
        if row[-1] == 'Yes':  # Only consider positive instances (class 'Yes')
            for i in range(len(hypothesis)):
                # Generalize the hypothesis if needed
                if hypothesis[i] != row[i]:
                    hypothesis[i] = '?'  # '?' denotes any value (generalized)
    return hypothesis

# Step 3: Main function to run the process
def main():
    file_path = 'training_data.csv'  # Path to the CSV file created above
    data = read_csv(file_path)

    # Step 3.1: Apply the Find-S algorithm
    hypothesis = find_s_algorithm(data)

    # Step 3.2: Print the most specific hypothesis
    print("Most Specific Hypothesis:", hypothesis)

if __name__ == "__main__":
    main()

Output:

Result:
Ex.No: 03
Support Vector Machine (SVM) Decision Boundary
Date:

Aim:

Algorithm:

Program:
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Step 1: Load the Iris dataset
iris = datasets.load_iris()
X = iris.data[:, :2]  # Use only the first two features (sepal length, sepal width) for 2D visualization
y = iris.target

# Step 2: Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Standardize the data (important for SVM to perform well)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Step 4: Train the SVM classifier (use a linear kernel for simplicity)
svm = SVC(kernel='linear', random_state=42)
svm.fit(X_train, y_train)

# Step 5: Plot the decision boundary
# Create a mesh grid over the standardized feature space
# (the grid must be in the same scale the model was trained on)
x_min, x_max = X_train[:, 0].min() - 1, X_train[:, 0].max() + 1
y_min, y_max = X_train[:, 1].min() - 1, X_train[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 500), np.linspace(y_min, y_max, 500))

# Predict the class labels for each point in the grid
Z = svm.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

# Plot the decision boundary
plt.contourf(xx, yy, Z, alpha=0.75, cmap=plt.cm.coolwarm)

# Step 6: Plot the training points
plt.scatter(X_train[:, 0], X_train[:, 1], c=y_train, edgecolors='k', marker='o', s=100, cmap=plt.cm.coolwarm)

# Plot support vectors
plt.scatter(svm.support_vectors_[:, 0], svm.support_vectors_[:, 1], s=200, facecolors='none',
            edgecolors='k', linewidth=2, label='Support Vectors')

# Step 7: Set labels and title
plt.title("SVM Decision Boundary and Support Vectors (Linear Kernel)")
plt.xlabel('Sepal Length (standardized)')
plt.ylabel('Sepal Width (standardized)')
plt.legend()
plt.show()
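The plot shows the boundary but not predictive quality; a small optional addition (using scikit-learn's accuracy_score) evaluates the trained classifier on the held-out test set:

from sklearn.metrics import accuracy_score

# Evaluate the trained SVM on the standardized test split from above
y_pred = svm.predict(X_test)
print("Test accuracy: {:.2f}%".format(accuracy_score(y_test, y_pred) * 100))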

Result:
Ex.No: 04
Decision Tree Classification using ID3 Algorithm
Date:

Aim:

Algorithm:

Program:
# Importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree

# Define the dataset
data = {
    'Weather': ['Sunny', 'Sunny', 'Overcast', 'Rainy', 'Rainy', 'Sunny', 'Overcast', 'Rainy', 'Sunny'],
    'Temperature': ['Hot', 'Mild', 'Mild', 'Cool', 'Mild', 'Cool', 'Hot', 'Mild', 'Mild'],
    'PlayTennis': ['No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'Yes']
}

# Convert to DataFrame
df = pd.DataFrame(data)

# Encode the categorical variables (Weather, Temperature, PlayTennis)
df_encoded = pd.get_dummies(df)

# Define features and target
X = df_encoded[['Weather_Sunny', 'Weather_Overcast', 'Weather_Rainy', 'Temperature_Hot',
                'Temperature_Mild', 'Temperature_Cool']]
y = df_encoded['PlayTennis_Yes']

# Initialize and train the decision tree classifier (entropy criterion, as in ID3)
clf = DecisionTreeClassifier(criterion='entropy')
clf.fit(X, y)

# Visualize the tree
plt.figure(figsize=(10, 6))
tree.plot_tree(clf, filled=True, feature_names=X.columns.tolist(), class_names=['No', 'Yes'], rounded=True)
plt.show()

# Classify a new sample
new_sample = pd.DataFrame({
    'Weather_Sunny': [1], 'Weather_Overcast': [0], 'Weather_Rainy': [0],
    'Temperature_Hot': [0], 'Temperature_Mild': [1], 'Temperature_Cool': [0]
})

prediction = clf.predict(new_sample)
print("Predicted class for the new sample:", 'Yes' if prediction[0] else 'No')
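The induced rules can also be inspected as plain text; a minimal optional sketch using scikit-learn's export_text:

from sklearn.tree import export_text

# Print the learned decision rules in readable form
print(export_text(clf, feature_names=X.columns.tolist()))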

Result:
Ex.No: 05
Clustering Using EM (GMM) and k-Means Algorithms
Date:

Aim:

Algorithm:

Program:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Generate synthetic data with 2D features
np.random.seed(42)

# Create 3 clusters of data points with different means
data1 = np.random.randn(300, 2) + [5, 5]
data2 = np.random.randn(300, 2) + [-5, -5]
data3 = np.random.randn(300, 2) + [5, -5]

# Combine the data into a single dataset
X = np.vstack([data1, data2, data3])

# Create K-Means model with 3 clusters
# (n_init set explicitly for consistent behaviour across sklearn versions)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)

# Create Gaussian Mixture Model (GMM) with 3 components
gmm = GaussianMixture(n_components=3, random_state=0)
gmm_labels = gmm.fit_predict(X)

# Plotting the results
fig, axs = plt.subplots(1, 2, figsize=(12, 6))

# Plot K-Means clustering
axs[0].scatter(X[:, 0], X[:, 1], c=kmeans_labels, cmap='viridis', marker='.')
axs[0].scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=200, c='red', marker='X')
axs[0].set_title("K-Means Clustering")

# Plot GMM clustering
axs[1].scatter(X[:, 0], X[:, 1], c=gmm_labels, cmap='viridis', marker='.')
axs[1].set_title("GMM Clustering (EM)")

# Show the plots
plt.tight_layout()
plt.show()

# Optional: print cluster centers for K-Means and GMM component means
print("K-Means Cluster Centers:")
print(kmeans.cluster_centers_)

print("\nGMM Component Means:")
print(gmm.means_)
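To compare the two clusterings numerically, a silhouette score can be computed; a small optional addition using sklearn.metrics:

from sklearn.metrics import silhouette_score

# Silhouette ranges from -1 to 1; higher means better-separated clusters
print("K-Means silhouette:", silhouette_score(X, kmeans_labels))
print("GMM silhouette:", silhouette_score(X, gmm_labels))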

Result:
Ex.No: 06
k-Nearest Neighbor Classification
Date:

Aim:

Algorithm:

Program:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Step 1: Load the Iris dataset
iris = load_iris()
X = iris.data    # Features (sepal length, sepal width, petal length, petal width)
y = iris.target  # Labels (species of iris)

# Step 2: Split the dataset into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Initialize and train the k-NN classifier (k=3 in this case)
k = 3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)

# Step 4: Make predictions on the test set
y_pred = knn.predict(X_test)

# Step 5: Compare predictions with actual labels and print results
correct_predictions = []
incorrect_predictions = []

for i in range(len(y_test)):
    if y_pred[i] == y_test[i]:
        correct_predictions.append((X_test[i], y_test[i], y_pred[i]))
    else:
        incorrect_predictions.append((X_test[i], y_test[i], y_pred[i]))

# Step 6: Print correct predictions
print("Correct Predictions:")
for x, actual, predicted in correct_predictions:
    print(f"Features: {x}, Actual: {iris.target_names[actual]}, Predicted: {iris.target_names[predicted]}")

# Step 7: Print incorrect predictions
print("\nIncorrect Predictions:")
for x, actual, predicted in incorrect_predictions:
    print(f"Features: {x}, Actual: {iris.target_names[actual]}, Predicted: {iris.target_names[predicted]}")

# Step 8: Calculate and print the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f"\nAccuracy: {accuracy * 100:.2f}%")
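k = 3 is a reasonable default, but a better value can be chosen empirically; a minimal optional sketch that scans odd k values with 5-fold cross-validation on the training set:

from sklearn.model_selection import cross_val_score

# Report mean cross-validation accuracy for several odd values of k
for k in range(1, 12, 2):
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X_train, y_train, cv=5)
    print(f"k={k}: mean CV accuracy = {scores.mean():.3f}")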

Result:
