FML File Final

This document is a practical file submitted by 5 students to partially fulfill requirements for an evaluation in the "Fundamentals of Machine Learning-Lab" course. It contains 10 experiments covering various machine learning algorithms including linear regression, logistic regression, KNN, SVM, random forests, naive Bayes, decision trees, K-means clustering, Gaussian mixture models, and association rule classification. For each experiment, there is an abstract, code implementation, output and learning outcomes section.

Uploaded by

Kunal Saini

FML_LAB FILE

Practical File
Submitted in partial fulfillment for the evaluation of

“Fundamentals of Machine Learning-Lab”

Submitted By:
NAME: Kunal Saini, Aditya Jain, Aditi Jain, Ananya Tyagi, Aditya Vikram
ENROLL NO: 07317711621, 07617711621, 08217711621, 08817711621, 09117711621
BRANCH & SECTION: AI & ML-B

Submitted To:
 Dr. Sonakshi Vij


INDEX
S.No  Details                                                                          Page No.  Date  Grade/Evaluation  Sign
1     Study and implement Linear Regression
2     Study and implement Logistic Regression
3     Study and implement K nearest neighbour (KNN)
4     Study and implement Classification using SVM
5     Study and implement Bagging using Random Forests
6     Study and implement Naïve Bayes
7     Study and implement Decision Trees
8     Study and implement K-means Clustering to find Natural Patterns in data
9     Study and implement Gaussian Mixture Model using the Expectation Maximization
10    Study and implement Classification based on association rules


EXPERIMENT-1
(LINEAR REGRESSION)
ABSTRACT:

CODE:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load the dataset


data = pd.read_csv('data.csv')

# Select the 'perimeter_mean' and 'concavity_mean' columns for linear regression


X = data[['perimeter_mean']] # Predictor variable
y = data['concavity_mean'] # Target variable

# Create a linear regression model


model = LinearRegression()

# Fit the model to the data


model.fit(X, y)

# Perform predictions
y_pred = model.predict(X)

# Calculate the mean squared error


mse = mean_squared_error(y, y_pred)
print("Mean Squared Error:", mse)

# Plot the linear regression line


plt.scatter(X, y, color='blue', label='Data')
plt.plot(X, y_pred, color='red', label='Linear Regression')

plt.xlabel('Perimeter_mean')
plt.ylabel('Concavity_mean')
plt.title('Linear Regression')
plt.legend()
plt.show()
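
For a single predictor, the line fitted by `LinearRegression` can be cross-checked against the closed-form least-squares formulas. The sketch below is illustrative only: it uses a small made-up sample instead of `data.csv`, and the helper names `fit_line` and `mean_squared_error_of_line` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_squared_error_of_line(xs, ys, slope, intercept):
    """Average squared residual of the line on the given points."""
    return mean((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Toy sample that lies exactly on y = 2x + 1 (made-up numbers).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
error = mean_squared_error_of_line(xs, ys, slope, intercept)
print("slope:", slope, "intercept:", intercept, "MSE:", error)
```

On real data the slope and intercept should agree with `model.coef_[0]` and `model.intercept_` up to floating-point error.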

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-2
(LOGISTIC REGRESSION)
ABSTRACT:

CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Load the dataset


data = pd.read_csv('data.csv')

# Select the 'perimeter_mean' and 'diagnosis' columns for logistic regression


X = data[['perimeter_mean']] # Predictor variable
y = data['diagnosis'] # Target variable

# Convert the feature to a 2-D NumPy array of shape (n_samples, 1), as scikit-learn expects


X = np.array(X).reshape(-1, 1)

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a logistic regression model


model = LogisticRegression()

# Fit the model to the training data


model.fit(X_train, y_train)

# Perform predictions on the test data


y_pred = model.predict(X_test)

# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Create a confusion matrix


confusion_mat = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(confusion_mat)

# Plot the logistic regression curve


X_range = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)
y_proba = model.predict_proba(X_range)[:, 1]
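# Encode the class that predict_proba's column 1 refers to (model.classes_[1]) as 1 and the other as 0,
# so the points and the probability curve share the same numeric y-axis
plt.scatter(X, (y == model.classes_[1]).astype(int), color='blue', label='Data')
plt.plot(X_range, y_proba, color='red', label='Logistic Regression')
plt.xlabel('perimeter_mean')
plt.ylabel(f'P(diagnosis = {model.classes_[1]})')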
plt.title('Logistic Regression')
plt.legend()
plt.show()
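
The red curve plotted above is the logistic (sigmoid) function applied to a linear score of the feature. A minimal sketch of that mapping follows; the weight `w` and bias `b` are made-up values chosen for illustration, not coefficients learned from `data.csv`, and the function names are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(x, w=0.2, b=-18.0):
    """Probability of the positive class for a single feature value x.

    w and b are hypothetical coefficients; with these values the decision
    boundary (probability 0.5) sits at x = -b / w = 90.
    """
    return sigmoid(w * x + b)


for x in (60.0, 90.0, 120.0):
    print(x, round(predict_proba(x), 4))
```

With a fitted scikit-learn model, `w` and `b` would correspond to `model.coef_[0][0]` and `model.intercept_[0]`.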

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-3
(K NEAREST NEIGHBOR)
ABSTRACT:

CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Load the dataset


data = pd.read_csv('data.csv')

# Select the relevant features and target variable


X = data[['radius_mean', 'perimeter_mean', 'concavity_mean', 'smoothness_mean',
'texture_mean']]
y = data['diagnosis']  # class label; 'id' is a unique row identifier and cannot serve as a class

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a KNN classifier with k=5


knn = KNeighborsClassifier(n_neighbors=5)

# Fit the model to the training data


knn.fit(X_train, y_train)

# Perform predictions on the test data


y_pred = knn.predict(X_test)

# Calculate the accuracy of the model


accuracy = accuracy_score(y_test, y_pred)

print("Accuracy:", accuracy)

# Create a confusion matrix


confusion_mat = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(confusion_mat)

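# Scatter plot of two features, coloured by class label (encoded as integers)
class_codes, class_names = pd.factorize(data['diagnosis'])
plt.scatter(data['radius_mean'], data['concavity_mean'], c=class_codes, cmap='coolwarm')
plt.xlabel('Radius_mean')
plt.ylabel('Concavity_mean')
plt.title('KNN')
plt.colorbar(label='Diagnosis (encoded)')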
plt.show()
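
What `KNeighborsClassifier` does at prediction time can be sketched in a few lines: sort the training points by distance to the query and take a majority vote among the `k` closest. The example below is a simplified illustration on made-up 2-D points with toy labels 'B' and 'M'; the helper name `knn_predict` is hypothetical and no tie-breaking or feature scaling is attempted.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Pair each training point with its label and sort by Euclidean distance to the query.
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Two well-separated toy clusters (made-up values).
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ['B', 'B', 'B', 'M', 'M', 'M']

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (5.1, 5.0)))
```

Because the vote depends on raw distances, features measured on very different scales are usually standardised before KNN is applied.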

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-4
(CLASSIFICATION USING SVM)
ABSTRACT:

CODE:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score

# Load the CSV file into a pandas dataframe


df = pd.read_csv('data.csv')

# Extract the relevant features


X = df[['perimeter_mean']]
y = df['concavity_mean']

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale the data


scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the SVM model


svm_regressor = SVR(kernel='rbf')
svm_regressor.fit(X_train_scaled, y_train)

# Make predictions on the test data


y_pred = svm_regressor.predict(X_test_scaled)

# Calculate the coefficient of determination (R^2) on the test data
r2 = r2_score(y_test, y_pred)
print('R^2 score:', r2)

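# Plot the actual test values and the fitted curve.
# The test points are sorted by x first so the curve is drawn left to right instead of zigzagging.
X_sorted = X_test.sort_values('perimeter_mean')
y_curve = svm_regressor.predict(scaler.transform(X_sorted))
plt.figure(figsize=(10, 5))
plt.scatter(X_test, y_test, color='red')
plt.plot(X_sorted['perimeter_mean'], y_curve, color='blue', linewidth=3)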
plt.xlabel('perimeter_mean')
plt.ylabel('concavity_mean')
plt.title('Support Vector Regression')
plt.show()
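
The experiment title refers to classification, while the listing above fits a support vector regressor (`SVR`). As a complement, here is a minimal sketch of a linear SVM classifier trained by sub-gradient descent on the regularised hinge loss. It is not the method used in this file: the toy points, the +1/-1 labels, the hyperparameters and the helper names `train_linear_svm` and `predict` are all illustrative assumptions.

```python
def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=20):
    """Sub-gradient descent on the L2-regularised hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # Functional margin of this point under the current parameters.
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The regularisation term shrinks w on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:
                # Point is misclassified or inside the margin: move the boundary away from it.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Classify x by the sign of the decision function w.x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Linearly separable toy data centred on the origin (made-up values).
X = [(2.0, 2.0), (-2.0, -2.0), (2.5, 1.5), (-1.5, -2.5), (1.5, 2.5), (-2.5, -1.5)]
y = [1, -1, 1, -1, 1, -1]

w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("training predictions:", [predict(w, b, x) for x in X])
```

In scikit-learn the corresponding classifier is `sklearn.svm.SVC` (or `LinearSVC`), which would take a categorical target such as `diagnosis` in place of the continuous `concavity_mean`.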

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-5
(BAGGING USING RANDOM FORESTS)
ABSTRACT:

CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Load the dataset


data = pd.read_csv('data.csv')

# Select the features and target variable


X = data[['perimeter_mean', 'area_mean', 'concavity_mean',
          'radius_mean', 'texture_mean', 'smoothness_mean']]
#X = X.dropna()
y = data['diagnosis']

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Random Forest classifier with 100 trees


rf = RandomForestClassifier(n_estimators=100, random_state=42)

# Fit the model to the training data


rf.fit(X_train, y_train)

# Perform predictions on the test data


y_pred = rf.predict(X_test)
# Calculate the accuracy of the model

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Create a confusion matrix


confusion_mat = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(confusion_mat)

# Feature Importance
feature_importance = rf.feature_importances_
feature_names = X.columns

# Sort feature importance in descending order


sorted_indices = np.argsort(feature_importance)[::-1]
sorted_feature_names = feature_names[sorted_indices]
sorted_feature_importance = feature_importance[sorted_indices]

# Bar plot of feature importance


plt.figure(figsize=(10, 6))
sns.barplot(x=sorted_feature_importance, y=sorted_feature_names)
plt.xlabel('Feature Importance')
plt.ylabel('Features')
plt.title('Random Forest')
plt.show()

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-6
(NAIVE BAYES)
ABSTRACT:

CODE:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, classification_report

# Load the dataset


data = pd.read_csv("data.csv")

# Split the dataset into features (X) and target variable (y)
X = data[['perimeter_mean', 'area_mean', 'concavity_mean',
          'radius_mean', 'texture_mean', 'smoothness_mean']]
#X = X.dropna()
y = data['diagnosis']

# Split the dataset into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Gaussian Naive Bayes model


naive_bayes = GaussianNB()

# Train the model


naive_bayes.fit(X_train, y_train)

# Make predictions on the test set


y_pred = naive_bayes.predict(X_test)

# Create a confusion matrix
cm = confusion_matrix(y_test, y_pred)

# Plot the confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, cmap="Blues", fmt="d", cbar=False)
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()

# Create a classification report


report = classification_report(y_test, y_pred)

# Print the classification report


print("Classification Report:")
print(report)

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-7
(DECISION TREES)
ABSTRACT:

CODE:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import confusion_matrix

# Load the dataset


data = pd.read_csv("data.csv")

# Split the dataset into features (X) and target variable (y)
X = data[['perimeter_mean', 'area_mean', 'concavity_mean',
          'radius_mean', 'texture_mean', 'smoothness_mean']]
#X = X.dropna()
y = data['diagnosis']

# Split the dataset into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Decision Tree classifier


decision_tree = DecisionTreeClassifier()

# Train the model


decision_tree.fit(X_train, y_train)

# Plot the decision tree, labelling nodes with the classes the model actually learned
plt.figure(figsize=(12, 8))
plot_tree(decision_tree, feature_names=list(X.columns),
          class_names=[str(c) for c in decision_tree.classes_], filled=True)
plt.title("Decision Tree")
plt.show()

# Make predictions on the test set


y_pred = decision_tree.predict(X_test)

# Create a confusion matrix


cm = confusion_matrix(y_test, y_pred)

# Plot the confusion matrix


plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, cmap="Blues", fmt="d", cbar=False)
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()

# Generate the classification report


report = classification_report(y_test, y_pred)
print("Classification Report:")
print(report)

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-8
(K MEANS CLUSTERING)
ABSTRACT:

CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Load the dataset


data = pd.read_csv('data.csv')

# Extract the features


X = data[['concavity_mean', 'perimeter_mean']].values

# Number of clusters (k)


k=3

# Create a KMeans instance


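# Fix n_init and random_state so the clustering is reproducible across runs
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)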

# Fit the model to the data


kmeans.fit(X)

# Obtain the cluster labels for the data


labels = kmeans.labels_

# Obtain the cluster centers


centers = kmeans.cluster_centers_

# Print the cluster labels and centers


print("Cluster Labels:")

print(labels)
print("Cluster Centers:")
print(centers)

# Scatter plot of the data points with cluster assignments and centers
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(centers[:, 0], centers[:, 1], marker='X', color='red', label='Centers')
plt.xlabel('Concavity_mean')
plt.ylabel('Perimeter_mean')
plt.title('K-means Clustering')
plt.legend()
plt.show()
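
The fitting step hidden inside `KMeans.fit` alternates between assigning points to their nearest centre and moving each centre to the mean of its points (Lloyd's algorithm). The following is a simplified sketch on made-up 2-D points; it initialises the centres naively with the first `k` points, which is an assumption of this example and not what scikit-learn does by default, and the function name `kmeans` is hypothetical.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm. Returns (labels, centers)."""
    centers = [points[i] for i in range(k)]  # naive initialisation: first k points
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for every point.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the centres are stable
        labels = new_labels
        # Update step: move each centre to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centre if a cluster ends up empty
                centers[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return labels, centers


# Two obvious groups of toy points (made-up values).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centers = kmeans(points, k=2)
print("labels:", labels)
print("centers:", centers)
```

Since the assignment step uses raw Euclidean distance, a feature with a much larger numeric range (here `perimeter_mean` versus `concavity_mean`) tends to dominate the clusters unless the features are scaled first.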

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-9
(GAUSSIAN MIXTURE MODEL)
ABSTRACT:

CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.mixture import GaussianMixture

# Load the dataset


data = pd.read_csv('data.csv')

# Extract the features


X = data[['concavity_mean', 'area_mean']].values

# Number of clusters/components (k)


k=3

# Create a Gaussian Mixture Model instance


gmm = GaussianMixture(n_components=k, random_state=42)  # fixed seed for reproducible components

# Fit the model to the data


gmm.fit(X)

# Obtain the predicted labels for the data


labels = gmm.predict(X)

# Get the probabilities of each data point belonging to each cluster


probs = gmm.predict_proba(X)

# Print the cluster labels and probabilities


print("Cluster Labels:")

print(labels)
print("Probabilities:")
print(probs)

# Scatter plot of the data points with cluster assignments


plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.xlabel('Concavity_mean')
plt.ylabel('Area_mean')
plt.title('Gaussian Mixture Model Clustering')
plt.show()

OUTPUT:

LEARNING OUTCOMES:


EXPERIMENT-10
(CLASSIFICATION BASED ON ASSOCIATION RULES)
ABSTRACT:

CODE:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# Load the dataset


data = pd.read_csv('data.csv')

# Select the features and target variable


X = data[['radius_mean', 'perimeter_mean', 'area_mean']]
y = data['diagnosis']

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a decision tree classifier


clf = DecisionTreeClassifier()

# Fit the model to the training data


clf.fit(X_train, y_train)

# Perform predictions on the test data


y_pred = clf.predict(X_test)

# Calculate the accuracy of the model


accuracy = accuracy_score(y_test, y_pred)

print("Accuracy:", accuracy)

# Create a confusion matrix


confusion_mat = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(confusion_mat)
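
The listing above classifies with a decision tree. For comparison, a classifier built on association rules scores rules of the form "feature condition => class" by their support and confidence and predicts with the strongest matching rule. The following is a minimal sketch on made-up, already discretised records; the item names (such as `radius=high`), the thresholds and the helper names `mine_class_rules` and `classify` are all hypothetical, and only single-item rules are mined.

```python
from collections import Counter


def mine_class_rules(records, min_support=0.3, min_confidence=0.8):
    """records: list of (set_of_items, class_label).

    Returns rules (item, label, support, confidence), strongest first.
    """
    n = len(records)
    item_counts = Counter()  # how often each item occurs
    pair_counts = Counter()  # how often each item occurs together with each class
    for items, label in records:
        for item in items:
            item_counts[item] += 1
            pair_counts[(item, label)] += 1

    rules = []
    for (item, label), count in pair_counts.items():
        support = count / n                     # P(item and class)
        confidence = count / item_counts[item]  # P(class | item)
        if support >= min_support and confidence >= min_confidence:
            rules.append((item, label, support, confidence))
    # Rank by confidence, then support; the item name breaks ties deterministically.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0]))


def classify(items, rules, default):
    """Return the class of the first (strongest) rule whose item is present."""
    for item, label, _, _ in rules:
        if item in items:
            return label
    return default


# Made-up discretised records: (items, class label).
records = [
    ({"radius=high", "area=high"}, "M"),
    ({"radius=high", "area=high"}, "M"),
    ({"radius=high", "area=low"}, "M"),
    ({"radius=low", "area=low"}, "B"),
    ({"radius=low", "area=low"}, "B"),
    ({"radius=low", "area=high"}, "B"),
]

rules = mine_class_rules(records)
for rule in rules:
    print(rule)
print(classify({"radius=high", "area=low"}, rules, default="B"))
```

Applying this idea to `data.csv` would first require binning the continuous columns (for example `radius_mean`) into categories, a step this sketch assumes has already been done.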

OUTPUT:

LEARNING OUTCOMES:
