Rucha Shinde et al, / (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 6 (1), 2015, 637-639

An Intelligent Heart Disease Prediction System Using K-Means Clustering and Naïve Bayes Algorithm

Rucha Shinde(1), Sandhya Arjun(2), Priyanka Patil(3), Prof. Jaishree Waghmare(4)
Trinity College of Engineering & Research, Pune

Abstract—Nowadays people work on computers for hours on end and do not have time to take care of themselves. Hectic schedules and the consumption of junk food affect people's health, mainly the heart. We are therefore implementing a heart disease prediction system that combines two data mining techniques, the naïve Bayes algorithm and the k-means clustering algorithm. This paper gives an overview of that system. It predicts heart disease from various patient attributes and presents the output as a prediction. For grouping the attributes it uses the k-means algorithm, and for prediction it uses the naïve Bayes algorithm.

Index Terms—Data mining, comma separated files, naïve Bayes, k-means algorithm, heart disease.

I. INTRODUCTION
Data mining is the practice of examining large preexisting databases in order to generate new information. It converts raw data into useful information. It analyzes the data for relationships that have not previously been discovered. [1]
The steps of data mining are: data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation and knowledge representation.
Medical data mining is a domain with a lot of imprecision and uncertainty. Clinical decisions are usually based on the doctor's intuition, which may lead to disastrous consequences: errors in clinical decisions and excessive medical costs. [1]

Figure 1: Data mining process

II. LITERATURE REVIEW
[1] Intelligent heart disease prediction system using data mining techniques:
In this paper heart disease prediction is done using data mining techniques such as decision trees, neural networks and naïve Bayes. The system answers "what if" queries. It is implemented on the .NET platform. It is used for heart disease prediction.
[2] An empirical study on applying data mining techniques for analysis and prediction of heart disease:
It is found that the health care environment is poor at extracting knowledge from its data, so in this paper data mining techniques are applied. The paper deals with the application of data mining.
[3] Prediction system for heart disease using naïve Bayes:
It is a web-based classification system. It retrieves hidden data from a database and compares the input values with a trained dataset. The paper mentions that treatment costs are reduced because of this system.
[4] Decision support in heart disease prediction system using naïve Bayes:
This system was developed using data mining techniques, mainly naïve Bayes. It takes the patient's attributes as input. It helps trained nurses and medical students to treat patients.
[5] Intelligent and effective heart attack prediction system using data mining:
In this paper k-means clustering is used. The system is capable of predicting heart disease. Serialization is also used in this system: it converts the data objects into streams of bytes and stores them in a database.

III. BLOCK DIAGRAM
The following block diagram represents the step-by-step implementation of the heart disease prediction system. The block diagram consists of two parts: the first is training and the second is prediction. In the training part, the input, i.e. the patient's attributes, is taken first and a dataset is formed. The dataset is then given labels according to the names of the attributes. Next the dataset is transformed: the attributes are separated by commas, i.e. stored in comma separated values (C.S.V.) files. The K-means clustering algorithm is then applied to this dataset; here the attributes are grouped, and the attributes are added according to their groups. After this the model is ready for the prediction algorithm to be applied to it.
In the prediction part, we use the naïve Bayes algorithm. Naïve Bayes basically applies the probability
concept. It takes each and every attribute into account and gives the result. The output is a prediction of whether the person has heart disease or is likely to develop it. This system will help the person take preventive measures against the disease.

Fig. 2: Block diagram

IV. COMMA SEPARATED VALUES (CSV)
CSV stands for Comma Separated Values. The format was earlier also known as CSL, i.e. Comma Separated List. [2][3]
A CSV file stores tabular data in plain-text form. Plain text means that the file contains only characters, with no data that has to be interpreted as binary numbers. A CSV file contains records whose fields are separated by some character or string, most commonly a comma or a tab. CSV refers to a large family of formats.
It is supported by many consumer and business applications. It is mainly used when a user needs to transfer information from a database program to a spreadsheet that uses a completely different format. [2][3]
In CSV files records are divided into fields separated by delimiters. They work with both Unicode and ASCII. CSV files can be translated from one character set to another. CSV files cannot represent object-oriented databases, because all CSV records are expected to have the same structure. CSV files are also called flat files. [2][3]
In this system we take various attributes such as age, obesity, gender, cholesterol, smoker, blood pressure, chest pain, blood sugar, ECG results, etc. As these inputs are taken one by one, they are stored in CSV files: the inputs are converted into a tabular format and separated by commas.
Because of CSV files the data appears in an orderly and well-represented manner.

V. K-MEANS CLUSTERING ALGORITHM
Clustering is the process of grouping data objects so that objects within a cluster are similar to one another and dissimilar objects fall into other clusters. It is also called data segmentation in some applications, because it divides a large data set into groups according to similarity. [4]

Requirements of clustering in data mining:
1) Deals with different types of attributes
2) Deals with noisy data
3) Requires minimum knowledge to determine input parameters
4) Usability
5) High dimensionality

K-MEANS CLUSTERING
K-means is one of the simplest unsupervised learning algorithms for solving the clustering problem. The procedure is simple and easy: it classifies a given data set into a certain number of clusters, k.
It defines k centroids, one for each cluster. They should be placed as far away from each other as possible. Then each point of the data set is associated with the nearest centroid. When no point is pending, an early grouping is done. Then we re-calculate k new centroids for the clusters resulting from the previous step. Once we have the k new centroids, a new binding is done between the same data points and the nearest new centroid. A loop has thus been generated; as a result of this loop the k centroids change their location step by step until no more changes occur. [4]
The advantages of the k-means clustering algorithm are simplicity and speed.

Algorithm:
1) Select k centers from the data (at random).
2) Divide the data into k clusters by grouping each point with its nearest center.
3) Calculate the mean of each of the k clusters to find the new centers.
4) Repeat steps 2 and 3 until the centers do not change.

In this system we mainly use clustering for grouping the attributes. We take about 10 attributes, such as age, obesity, gender, cholesterol, smoker, blood pressure, chest pain, blood sugar, ECG results, etc. These attributes are grouped using the K-means clustering algorithm.
E.g.: Suppose we take an attribute such as age and consider ages between 0 and 100. Applying the K-means algorithm to the age values finds the centroids, calculates the means and divides the values into groups. Here, age is divided into 3 groups, such as 0-30, 31-60 and 61-100. The groups are given values as follows:
0-30 = 0
31-60 = 1
61-100 = 2
The gender attribute is divided into groups as follows:
Male = 0
Female = 1
K-means is applied to each of the attributes mentioned above. After that the attributes and their values are added to a dataset accordingly. The model is then ready for prediction.

VI. NAÏVE BAYES ALGORITHM
The naïve Bayes classifier is based on Bayes' theorem. It makes a strong independence assumption. It is also known as the independent feature model.
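Before the classifier is applied, each attribute is grouped with K-means as described in Section V. As a rough illustration of that grouping step, the following Python sketch runs the four algorithm steps on a single numeric attribute (age). The function names `kmeans_1d` and `group_label`, the fixed random seed and the sample ages are assumptions made for this illustration; they are not taken from the system described in this paper.

```python
import random


def kmeans_1d(values, k, max_iter=100, seed=0):
    """Group 1-D values into k clusters and return the sorted cluster centers."""
    rng = random.Random(seed)
    # Step 1: select k distinct data values at random as the initial centers.
    centers = rng.sample(sorted(set(values)), k)
    for _ in range(max_iter):
        # Step 2: divide the data into k clusters by nearest center.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Step 3: the mean of each cluster becomes its new center
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        # Step 4: repeat until the centers do not change.
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers)


def group_label(value, centers):
    """Index of the nearest center; with sorted centers, 0 is the lowest group."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


# Made-up ages, used only to exercise the sketch.
ages = [22, 25, 27, 29, 41, 45, 48, 52, 67, 70, 74, 81]
centers = kmeans_1d(ages, k=3)
labels = [group_label(a, centers) for a in ages]
print(centers)
print(labels)
```

The group boundaries found this way depend on the data and on the random initial centers, so they will generally not coincide with fixed ranges such as 0-30, 31-60 and 61-100; those ranges are best read as an example of the kind of grouping that results.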

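Once the attributes have been reduced to group values, a naïve Bayes prediction multiplies the prior probability of each class by the per-attribute probabilities for that class and normalizes the result. The Python sketch below shows this under stated assumptions: the function names `train` and `posterior`, the choice of three attributes (age group, gender, smoker), the toy records and the add-one (Laplace) smoothing are all introduced here for illustration and do not come from the system described in this paper.

```python
from collections import Counter, defaultdict


def train(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute index, class)][value] = number of matching records
    value_counts = defaultdict(Counter)
    for rec, y in zip(records, labels):
        for j, v in enumerate(rec):
            value_counts[(j, y)][v] += 1
    # Number of distinct values seen per attribute (needed for smoothing).
    n_values = [len({rec[j] for rec in records}) for j in range(len(records[0]))]
    return class_counts, value_counts, n_values


def posterior(model, rec):
    """Return P(class | rec) for every class, using the independence assumption."""
    class_counts, value_counts, n_values = model
    total = sum(class_counts.values())
    scores = {}
    for y, n_y in class_counts.items():
        p = n_y / total  # prior probability of the class
        for j, v in enumerate(rec):
            # Probability of this attribute value given the class,
            # with add-one smoothing so unseen values do not give zero.
            p *= (value_counts[(j, y)][v] + 1) / (n_y + n_values[j])
        scores[y] = p
    evidence = sum(scores.values())  # normalizing constant
    return {y: s / evidence for y, s in scores.items()}


# Invented toy data: (age group 0/1/2, gender 0/1, smoker 0/1).
records = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 0),
           (2, 0, 1), (2, 1, 1), (2, 0, 0), (1, 0, 0)]
labels = [0, 0, 1, 0, 1, 1, 1, 0]  # 1 = heart disease, 0 = no heart disease

model = train(records, labels)
post = posterior(model, (2, 0, 1))  # oldest age group, male, smoker
print(post)
```

Because the attributes are treated as independent given the class, only simple per-attribute counts have to be stored, which is why a small training set can be enough to estimate the parameters.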

The naïve Bayes classifier assumes that the presence or absence of a particular feature of a class is unrelated to the presence or absence of any other feature, given the class.
The naïve Bayes classifier can be trained in a supervised learning setting. It uses the method of maximum likelihood. It has worked well in complex real-world situations. It requires only a small amount of training data to estimate the parameters for classification. Only the variances of the variables for each class need to be determined, not the entire covariance matrix. [5][6]
Naïve Bayes is mainly used when the dimensionality of the inputs is high. It presents its output in detail: the probability of each input attribute is shown for the predictable state. Naïve Bayes classification is the basis for many machine learning and data mining methods.

Bayes' theorem:

\[ P(H \mid X) = \frac{P(X \mid H)\, P(H)}{P(X)} \]

where
P(H|X) is the posterior probability of H conditioned on X,
P(X|H) is the likelihood, i.e. the probability of X conditioned on H,
P(H) is the prior probability of H,
P(X) is the prior probability of X.

Naïve Bayes predicts whether or not the patient is likely to get heart disease.
The values of the model dataset obtained after applying the K-means algorithm are compared with a trained dataset. Bayes' theorem is applied, and the probability that the patient has heart disease is obtained. [5][6]

VII. INPUT ATTRIBUTES
1) Age
2) Gender
3) Obesity
4) Smoking
5) Electrocardiographic (ECG) result
6) Heart rate
7) Chest pain
8) Cholesterol
9) Blood pressure
10) Blood sugar

VIII. CONCLUSION
In this paper we propose a heart disease prediction system using naïve Bayes and k-means clustering. We use k-means clustering to increase the efficiency of the output. This is an effective model for predicting patients with heart disease. The model can answer complex queries, and it has its own strengths with respect to ease of model interpretation, access to detailed information, and accuracy.

REFERENCES
[1] Sellappan Palaniappan, Rafiah Awang, "Intelligent Heart Disease Prediction System Using Data Mining Techniques", Department of Information Technology, Malaysia University of Science and Technology, Block C, Kelana Square, Jalan SS7/26 Kelana Jaya, 47301 Petaling Jaya, Selangor, Malaysia.
[2] "CSV File Reading and Writing" (http://docs.python.org/library/csv.html). Retrieved July 24, 2011. "is no "CSV standard""
[3] Y. Shafranovich, "Common Format and MIME Type for Comma-Separated Values (CSV) Files" (http://tools.ietf.org/html/rfc4180). Retrieved September 12, 2011.
[4] "A tutorial on clustering algorithms", home.deib.polimi.it/matteucc/Clustering/tutorial_html/kmeans.html
[5] Shadab Adam Pattekari and Asma Parveen, "Prediction System For Heart Disease Using Naïve Bayes", International Journal of Advanced Computer and Mathematical Sciences, ISSN 2230-9624, Vol 3, Issue 3, 2012, pp 290-294.
[6] Mrs. G. Subbalakshmi (M.Tech), Mr. K. Ramesh, M.Tech, Asst. Professor, Mr. M. Chinna Rao, M.Tech, (Ph.D.), Asst. Professor, "Decision Support in Heart Disease Prediction System using Naive Bayes", G. Subbalakshmi et al. / Indian Journal of Computer Science and Engineering (IJCSE), 2011.
[7] Jesmin Nahar, Tasadduq Imam, Kevin S. Tickle, Yi-Ping Phoebe Chen, "Association rule mining to detect factors which contribute to heart disease in males and females", Expert Systems with Applications 40 (2013) 1086–1093.
[8] Oleg Yu. Atkov (MD, PhD), Svetlana G. Gorokhova (MD, PhD), Alexandr G. Sboev (PhD), Eduard V. Generozov (PhD), Elena V. Muraseyeva (MD, PhD), Svetlana Y. Moroshkina, Nadezhda N. Cherniy, "Coronary heart disease diagnosis by artificial neural networks including genetic polymorphisms and clinical parameters", Journal of Cardiology (2012) 59, 190-194.
[9] Shantakumar B. Patil, Y. S. Kumaraswamy, "Intelligent and Effective Heart Attack Prediction System Using Data Mining and Artificial Neural Network", European Journal of Scientific Research, ISSN 1450-216X, Vol. 31 No. 4 (2009), pp. 642-656.
[10] Sivagowry, Dr. Durairaj M. and Persia, "An Empirical Study on applying Data Mining Techniques for the Analysis and Prediction of Heart Disease", 2013.
