
Experiment No. 10

Unsupervised Classification using K-Means Clustering: Implementation and Analysis in Python

Objectives:

In this lab, you will:

 Learn the fundamentals of unsupervised learning and its application in data clustering.
 Implement the K-Means clustering algorithm from scratch using Python and popular libraries such as NumPy and Pandas.
 Analyze the performance of the K-Means algorithm on different datasets and evaluate the clustering results using metrics such as within-cluster variance.
 Understand the impact of key parameters (such as the number of clusters) on the quality of the clustering results.
 Gain hands-on experience in preparing and preprocessing data for clustering, including handling missing data and scaling features.
 Visualize the clusters formed by the algorithm and interpret the findings in the context of the given data.
Prerequisites
 Familiarity with Python programming.
 Basic understanding of classifier performance metrics.

Unsupervised Classification (Clustering)


Unsupervised classification, also known as unsupervised learning, is a type of machine learning
where the model is trained on data that does not have labeled outputs. In contrast to supervised
learning, where the data includes input-output pairs (features and labels), unsupervised learning
involves finding patterns, relationships, or groupings within the data without any predefined
labels.
The goal of unsupervised classification is to discover inherent structures or groupings in the
data, which can be used for various tasks like data summarization, anomaly detection, or feature
extraction.
Common types of unsupervised learning algorithms include:
 Clustering: Grouping data points into clusters based on similarity.
 Dimensionality Reduction: Reducing the number of features while retaining important information (e.g., PCA).
K-Means Clustering
K-Means is one of the most popular and widely used clustering algorithms in unsupervised learning. It partitions a set of data points into K clusters, where K is specified in advance. The goal is to minimize the variance within each cluster while maximizing the variance between clusters.
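Formally, K-Means minimizes the within-cluster sum of squared distances (the quantity scikit-learn later reports as inertia_):

J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2

where C_j is the set of points assigned to cluster j and \mu_j is its centroid.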
How K-Means Works:
1. Initialization: Choose the number of clusters, K, and initialize K cluster centroids
(typically, these are selected randomly from the data points).
2. Assignment Step: Assign each data point to the closest centroid based on a distance
metric, typically Euclidean distance. This creates K groups of data points.
3. Update Step: Recalculate the centroids of the clusters by taking the mean of all the
points assigned to each cluster.
4. Repeat: Repeat steps 2 and 3 until the centroids no longer change significantly or the
algorithm converges. The clusters are then considered stable.
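The four steps above map directly onto a few lines of NumPy. The sketch below is one minimal from-scratch implementation in the spirit of this lab's objectives; the function name and parameters are illustrative choices, and for simplicity it assumes no cluster ever becomes empty:

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    # X: (n_samples, n_features) array of data points.
    rng = np.random.default_rng(seed)
    # Step 1 (Initialization): pick k distinct data points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2 (Assignment): label each point with its nearest centroid,
        # using Euclidean distance.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3 (Update): move each centroid to the mean of its assigned points.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Step 4 (Repeat): stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids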

Key Points in K-Means Clustering:


 K value: The number of clusters (K) must be specified before the algorithm runs. This
can be chosen based on domain knowledge, trial and error, or methods like the elbow
method.
 Distance Metric: The algorithm uses a distance metric (typically Euclidean distance) to
calculate the similarity between data points and centroids.
 Convergence: The algorithm iterates until the centroids do not change much between
iterations, signaling convergence.
 Sensitivity to Initialization: The outcome of K-Means can depend on the initial placement
of centroids. To address this, techniques like K-Means++ initialization are used to
improve results (see the scikit-learn example below).
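For illustration, here is how these choices surface in scikit-learn's KMeans constructor. init='k-means++' is scikit-learn's default initialization, and n_init controls how many times the algorithm is restarted from different initial centroids; the values shown are illustrative:

from sklearn.cluster import KMeans

# k-means++ spreads the initial centroids apart; n_init repeats the whole
# algorithm from 10 different initializations and keeps the run with the
# lowest within-cluster sum of squares; random_state makes it reproducible.
km = KMeans(n_clusters=3, init='k-means++', n_init=10, random_state=42)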

Advantages of K-Means:
 Scalability: K-Means works well for large datasets because it is computationally
efficient.
 Simplicity: The algorithm is easy to implement and understand.
 Versatility: It can be applied to many types of clustering problems (e.g., customer
segmentation, image compression).

Disadvantages:
 Choosing K: The algorithm requires you to specify the number of clusters in advance,
which may not always be obvious.
 Sensitivity to Outliers: K-Means can be affected by outliers, as they can significantly
alter the mean of a cluster.
 Non-Spherical Clusters: K-Means assumes that clusters are spherical and equally sized,
which may not always be the case in real-world data.
Clustering with K-Means in Python

from sklearn.cluster import KMeans
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from matplotlib import pyplot as plt
%matplotlib inline

# Load the dataset and inspect the first few rows
df = pd.read_csv("income.csv")
df.head()

# Scatter plot of the raw (unscaled) data
plt.scatter(df.Age, df['Income($)'])
plt.xlabel('Age')
plt.ylabel('Income($)')

# Fit K-Means with K=3 and obtain a cluster label for each row
km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income($)']])
y_predicted

# Attach the labels to the DataFrame as a new column
df['cluster'] = y_predicted
df.head()

# Coordinates of the three learned centroids
km.cluster_centers_

# Plot each cluster in its own color, with the centroids as stars
df1 = df[df.cluster==0]
df2 = df[df.cluster==1]
df3 = df[df.cluster==2]
plt.scatter(df1.Age, df1['Income($)'], color='green')
plt.scatter(df2.Age, df2['Income($)'], color='red')
plt.scatter(df3.Age, df3['Income($)'], color='black')
plt.scatter(km.cluster_centers_[:,0], km.cluster_centers_[:,1],
            color='purple', marker='*', label='centroid')
plt.xlabel('Age')
plt.ylabel('Income ($)')
plt.legend()
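Because Age and Income($) are on very different scales, the Euclidean distance is dominated by the larger-valued feature (income), so the unscaled clusters above tend to split along the income axis alone. Rescaling both features to a common range, as done next, gives them equal weight.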

Preprocessing using MinMaxScaler

MinMaxScaler rescales each feature to the range [0, 1] using x' = (x - x_min) / (x_max - x_min).
scaler = MinMaxScaler()

# Rescale Income($) to [0, 1]
scaler.fit(df[['Income($)']])
df['Income($)'] = scaler.transform(df[['Income($)']])

# Rescale Age to [0, 1]
scaler.fit(df[['Age']])
df['Age'] = scaler.transform(df[['Age']])
df.head()

# Both axes now span 0 to 1
plt.scatter(df.Age, df['Income($)'])

# Re-run K-Means on the scaled features
km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income($)']])
y_predicted
df['cluster'] = y_predicted
df.head()

km.cluster_centers_

# Plot the clusters and centroids found on the scaled data
df1 = df[df.cluster==0]
df2 = df[df.cluster==1]
df3 = df[df.cluster==2]
plt.scatter(df1.Age, df1['Income($)'], color='green')
plt.scatter(df2.Age, df2['Income($)'], color='red')
plt.scatter(df3.Age, df3['Income($)'], color='black')
plt.scatter(km.cluster_centers_[:,0], km.cluster_centers_[:,1],
            color='purple', marker='*', label='centroid')
plt.legend()

Elbow Plot

# Fit K-Means for each candidate K and record the sum of squared
# errors (scikit-learn's inertia_) for that clustering
sse = []
k_rng = range(1, 10)
for k in k_rng:
    km = KMeans(n_clusters=k)
    km.fit(df[['Age','Income($)']])
    sse.append(km.inertia_)

plt.xlabel('K')
plt.ylabel('Sum of squared error')
plt.plot(k_rng, sse)
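The "elbow" of the resulting curve, i.e., the value of K beyond which the SSE stops dropping sharply, is a reasonable choice for the number of clusters.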
Activity Name: ____________________
Group No.: ____________________
Student Roll No.: ____________________

| No. | CLO | PLO | Domain + Taxonomy | Criteria                                      | Awarded Score (out of 4 for each cell) |
|-----|-----|-----|-------------------|-----------------------------------------------|----------------------------------------|
| 1   | 5   | 5   | P3                | Operational Skills for Anaconda/Python/Spyder |                                        |
