Artificial Intelligence: Semester Project

This document presents a semester project on artificial intelligence using clustering algorithms on cricket statistics. It introduces the topic, dataset, and three clustering algorithms (K-means, Agglomerative, and Mean-Shift) that will be applied. The dataset contains statistics on the top batsmen from T20 cricket matches in 2016. The document outlines the process, advantages, and disadvantages of each clustering algorithm and provides the table of contents for the subsequent sections that will present and analyze the results of applying the three algorithms to the cricket statistics dataset.

Uploaded by

Abdullah Ammar


Artificial Intelligence

CSC-411

Semester Project

Instructor
Dr. Samabia Tehseem

Submitted by
Abdur Rehman Anwar
01-134142-199

Suneel Kumar
01-134142-202

Muhammad Ejaz
01-134142-201

Mian Usama Tariq

01-134142-119

Department of Computer Science

Bahria University, Islamabad

19-12-2017
Table of Contents
1. Introduction
1.1 Domain
1.2 Application
2. Dataset
2.1 Details
2.2 Source
3. Algorithms
3.1 K-means
3.1.1 Process
3.1.2 Advantages
3.1.3 Disadvantages
3.2 Agglomerative
3.2.1 Process
3.2.2 Advantages
3.2.3 Disadvantages
3.3 Mean-Shift
3.3.1 Process
3.3.2 Advantages
3.3.3 Disadvantages
4. Result and Analysis
4.1. Screenshots of the Python-Based Graphical User Interface (GUI)
4.1.1. Initial layout of the system.
4.1.2. When the user wants to load the required file, the system opens a dialog box to load it.
4.1.3. If the user tries to load a file with an extension that the system does not allow, it shows an error message in a message box.
4.1.4. If the user loads a file with an extension that the system allows, it shows a confirmation message in a message box.
4.1.5. Complete layout of the system. From here, the system runs the three clustering algorithms, each through its own button.
4.2. K-means Result
4.2.1. K-means Graph
4.2.2. K-means Analysis
4.3. Agglomerative Result
4.3.1. Agglomerative Graph
4.3.2. Agglomerative Analysis
4.4. Mean-Shift Result
4.4.1. Mean-Shift Graph
4.4.2. Mean-Shift Analysis
1. Introduction
1.1 Domain
We selected machine learning as the domain of our Artificial Intelligence project. Many results can be
produced by applying clustering algorithms to cricket statistics, which is a vast domain for these types of
algorithms.
1.2 Application
• To predict match outcomes.
• To predict a player’s performance.

2. Dataset
2.1 Details
The dataset we chose for our project lists the batsmen with the most T20 runs in 2016. It is a numeric
dataset (apart from the player name) which contains 14 columns and 50 rows. The columns are player name,
matches, innings, not outs, runs, average runs, strike rate, run rate, best score, 100s, 50s, 6s, 4s and 0s.
2.2 Source
We took this dataset from Kaggle, a well-known dataset repository. It is available on the
Kaggle website under the name T20 Cricket Most Runs 2016 at the link below:
https://www.kaggle.com/frankfernandes/t20mostruns2016/

3. Algorithms
3.1 K-means
K-means clustering is one of the unsupervised learning algorithms in machine learning. As its name
suggests, it forms k clusters based on k means. It computes the distance from each data point
to every mean, compares these distances, and assigns the point to the cluster whose mean is
nearest. It then recalculates the mean of each cluster and repeats
the same process until the new means are equal to the previous means.
3.1.1 Process
Assume that we have a dataset and a set of k centroids.
• Iterate through every data point in the dataset.
• Calculate the distance between each data point and each centroid.
• Assign each data point to the cluster whose centroid is at the minimum distance from it.
• After all points are assigned, recalculate each centroid from the values in its cluster.
• Repeat the process until the new centroids are equal to the previous centroids.
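The steps above can be sketched in a few lines of plain Python. This is only an illustration on a single attribute, not the code used in the project: the function name k_means_1d, the choice of random data points as initial centroids, and the run totals are all assumptions made for the example.

```python
import random

def k_means_1d(values, k, seed=0, max_iter=100):
    """Cluster 1-D values into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points picked at random.
    centroids = rng.sample(sorted(set(values)), k)
    labels = []
    for _ in range(max_iter):
        # Assign each value to the cluster whose centroid is nearest.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Recalculate each centroid as the mean of the values assigned to it.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        # Stop when the new centroids equal the previous ones.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical run totals, not taken from the dataset.
runs = [120, 135, 150, 480, 510, 530]
centroids, labels = k_means_1d(runs, k=2)
print(sorted(centroids), labels)
```

With k=2 on this toy input, the three low totals end up in one cluster and the three high totals in the other, whichever two starting points are drawn.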
3.1.2 Advantages
• It is a simple algorithm that even beginners can implement.
• It is fast compared with other clustering algorithms.
• It works best when the clusters in the dataset are distinct and well separated.
3.1.3 Disadvantages
• It does not work well when data values overlap, because it cannot decide which cluster these
values belong to.
• It gives different results for different representations of the same dataset.
• Euclidean distance is not a reliable measure when the attributes in the dataset have different
ranges of values.
• When the initial centers are chosen randomly, it may not lead to accurate results.
• It performs poorly on noisy data and is sensitive to outliers.
• The clusters should be linearly separable; otherwise it may not give correct results.
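The sensitivity to representation can be seen in the Euclidean distance itself. The sketch below uses made-up (runs, strike rate) pairs, not rows from the dataset, and shows that the nearest neighbour of a player changes when one attribute is merely expressed in a different unit. The helper names euclidean and nearest are illustrative.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(target, others):
    """Name of the point in `others` that is closest to `target`."""
    return min(others, key=lambda name: euclidean(target, others[name]))

# Hypothetical (runs, strike rate per 100 balls) values, not from the dataset.
a = (500, 120.0)
others = {"B": (530, 125.0), "C": (505, 160.0)}
print(nearest(a, others))                      # B

# Same players, with strike rate expressed per ball instead of per 100 balls.
a_rescaled = (a[0], a[1] / 100)
others_rescaled = {name: (r, sr / 100) for name, (r, sr) in others.items()}
print(nearest(a_rescaled, others_rescaled))    # C
```

Since K-means assigns points by exactly this distance, a change of units alone can move a point to a different cluster.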
3.2 Agglomerative
It is a hierarchical clustering algorithm. It clusters the dataset from the bottom up: every cluster at the
top has sub-clusters, and these sub-clusters in turn have their own sub-clusters.
3.2.1 Process
• Consider each value as a cluster.
• Construct a distance matrix.
• Merge the two clusters with the minimum distance between them.
• Reconstruct the distance matrix with the new clusters.
• Repeat the process until the distance matrix is reduced to two elements, i.e. two clusters remain.
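A minimal sketch of this bottom-up procedure on a single attribute is given below. The report does not say how the distance between two clusters is measured, so single linkage (distance between the closest members) is assumed here; the function name and the sample values are also hypothetical.

```python
def agglomerative_1d(values, n_clusters=2):
    """Bottom-up clustering of 1-D values; returns a list of clusters."""
    # Start with every value as its own cluster.
    clusters = [[v] for v in values]

    def distance(c1, c2):
        # Assumption: single linkage, i.e. distance between the closest members.
        return min(abs(x - y) for x in c1 for y in c2)

    while len(clusters) > n_clusters:
        # Rebuild the pairwise distances and find the closest pair of clusters.
        pairs = [(distance(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        # Merge the closest pair into one cluster (j > i, so index i stays valid).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical run totals, not taken from the dataset.
runs = [120, 135, 150, 480, 510, 530]
clusters = agglomerative_1d(runs)
print([sorted(c) for c in clusters])
```

Rebuilding all pairwise distances on every pass keeps the sketch short; a real implementation would update only the rows affected by the merge.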
3.2.2 Advantages
• It can generate an ordering of the objects, which can be useful for data visualization.
• Fewer clusters are created, which can help detection.
• It is simple to implement and has many applications.
3.2.3 Disadvantages
• It can produce different results depending on the distance metric used to calculate the
distances.
• It can produce imbalanced clusters.
• It is very difficult to choose the number of clusters.

3.3 Mean-Shift
Mean shift is a nonparametric clustering algorithm that does not need the number of
clusters to be specified in advance and places no restriction on the shape of the clusters. It shifts each
mean toward a region of higher density until convergence occurs: it iterates over the region of interest,
calculates the mean of all values in that region, and shifts the center to the newly calculated mean. In this
way it finds the centroids of all the clusters.
3.3.1 Process
• First take a random centroid and a radius for the area of interest.
• Calculate the distance of every data point from this centroid.
• Choose the values that lie within the defined area of interest.
• Calculate the mean of all chosen values.
• Shift the centroid to the newly calculated mean.
• Repeat the same process until the centroid no longer shifts.
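The loop above can be sketched as follows for two attributes. This is an illustration under stated assumptions rather than the project's code: it uses a flat kernel (every point inside the radius counts equally), starts one centroid from every data point instead of picking random ones, and runs on made-up (runs, matches) pairs.

```python
import math

def mean_shift(points, radius, tol=1e-6, max_iter=100):
    """Flat-kernel mean shift; returns the distinct cluster centres found."""
    centres = []
    # Assumption: every data point is tried as a starting centroid.
    for start in points:
        centre = start
        for _ in range(max_iter):
            # Choose the points that lie within the radius of the current centroid.
            window = [p for p in points if math.dist(p, centre) <= radius]
            # Shift the centroid to the mean of the chosen points.
            new_centre = tuple(sum(coord) / len(window) for coord in zip(*window))
            shift = math.dist(new_centre, centre)
            centre = new_centre
            if shift < tol:
                break
        # Keep a converged centre only if it is not close to one already found.
        if not any(math.dist(centre, c) < radius for c in centres):
            centres.append(centre)
    return centres

# Hypothetical (runs, matches) pairs, not taken from the dataset.
points = [(120, 10), (135, 12), (150, 11), (480, 20), (510, 22), (530, 21)]
centres = mean_shift(points, radius=60)
print(centres)
```

The number of clusters is not passed in anywhere; it falls out of the radius, which is why the bandwidth has to be chosen with care.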
3.3.2 Advantages
• Mean shift is independent of the type of application.
• It is simple and easy to implement.
• It does not require a predefined cluster shape.
• It can be used to cluster datasets with any type of features.
• It depends on a single parameter, the bandwidth.
3.3.3 Disadvantages
• The bandwidth must be chosen carefully; otherwise the result will be inaccurate.
• Choosing the window size for the kernel is non-trivial.
• The first centroid is picked randomly, so it can be an outlier.
4. Result and Analysis
4.1. Screenshots of the Python-Based Graphical User Interface (GUI)

4.1.1. Initial layout of the system.

4.1.2. When the user wants to load the required file, the system opens a dialog box
to load it.

4.1.3. If the user tries to load a file with an extension that the system does not allow,
it shows an error message in a message box.

4.1.4. If the user loads a file with an extension that the system allows,
it shows a confirmation message in a message box.
4.1.5. Complete layout of the system. From here, the system runs the three
clustering algorithms, each through its own button.
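
The extension check described in 4.1.3 and 4.1.4 could look roughly like the sketch below. The report does not state which extension the system accepts, so ".csv" is an assumption, and the function and file names are hypothetical.

```python
import os

# Assumption: the report does not name the accepted extension; ".csv" is a guess.
ALLOWED_EXTENSIONS = {".csv"}

def is_allowed_file(path):
    """Return True if the file name ends with an allowed extension (case-insensitive)."""
    _, ext = os.path.splitext(path)
    return ext.lower() in ALLOWED_EXTENSIONS

print(is_allowed_file("t20_most_runs_2016.csv"))   # True
print(is_allowed_file("notes.docx"))               # False
```

The GUI would then show the confirmation message when the function returns True and the error message otherwise.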

4.2. K-means Result

4.2.1. K-means Graph

4.2.2. K-means Analysis

In this system, the K-means algorithm is used to cluster the attribute “RUNS” from the
T20 dataset. It uses K=2, which produces the two clusters shown in the graph above. We conclude
that the boundary between the two clusters is clear for the attribute “RUNS”.
4.3. Agglomerative Result
4.3.1. Agglomerative Graph

4.3.2. Agglomerative Analysis

In this system, the Agglomerative algorithm is used to cluster the attribute “RUNS”
from the T20 dataset. It gives the two clusters shown in the graph above. We conclude that the
boundary between them is not as clear as the one produced by the K-means algorithm for the attribute “RUNS”.

4.4. Mean-Shift Result

4.4.1. Mean-Shift Graph

4.4.2. Mean-Shift Analysis

In this system, the Mean-Shift algorithm is used to cluster the attributes “RUNS” and
“NumberOfMatches” from the T20 dataset. We took two attributes because we considered it of little
use to apply the Mean-Shift algorithm to a single attribute.
The graph shows two clusters, colored blue and green
respectively. There are also some outliers, which are shown in yellow.
