Heart Disease Prediction Using Machine Learning
Disease is a growing global problem. Accurate and timely analysis of any
health-related problem is essential to detect and treat it early, but as the number of
cases grows this becomes more difficult. Machine learning techniques have been
advancing the health field through new research and methodologies. They can
contribute significantly to predicting and diagnosing diseases early, and they are
proving to be a viable option in the health sector.
This research paper focuses on the prediction of heart disease: estimating the
likelihood that a person has a condition that can lead to a heart attack, based on
various medical factors and related parameters.
We set up four different predictive models on a data set to predict whether a patient
is likely to be diagnosed with heart disease. This diagnosis is a difficult task because it
must be done accurately and efficiently. We apply several machine learning classifiers,
such as SVM, to distinguish affected patients from unaffected ones, and use a heat map
to analyze the data.
The most effective method was used to determine how the model could improve the
accuracy of disease prediction for an individual. The performance of the proposed model
is satisfactory: it predicts the disease with good accuracy compared with previously
used classifiers such as naive Bayes. The model therefore relieves a considerable
amount of pressure by estimating how likely the classifier is to diagnose the disease
correctly and efficiently.
TABLE OF CONTENTS
2 LITERATURE SURVEY 11
2.1. RELATED WORK 11
3 METHODOLOGY 14
3.1. EXISTING SYSTEM 14
3.2. PROPOSED SYSTEM 14
3.3. OBJECTIVE 14
3.4. SOFTWARE AND HARDWARE REQUIREMENTS 15
3.4.1. SOFTWARE REQUIREMENTS 15
3.4.2. HARDWARE REQUIREMENTS 15
3.4.3. LIBRARIES 15
3.5. PROGRAMMING LANGUAGES 16
3.5.1. PYTHON 16
3.5.2. DOMAIN 18
3.6. SYSTEM ARCHITECTURE 21
3.7. ALGORITHMS USED 23
3.7.1. LOGISTIC REGRESSION 23
3.7.2. K NEAREST NEIGHBOR 23
3.7.3. SUPPORT VECTOR MACHINE 23
3.7.4. RANDOM FOREST 24
3.8. MODULES 25
3.8.1. DATASET COLLECTION 25
3.8.2. TRAIN AND TEST THE MODELS
3.8.3. HYPERPARAMETER TUNING 25
B. SCREENSHOTS 38
LIST OF FIGURES
B.1. DATASET 38
B.3. HISTOGRAM 39
D. PLAGIARISM REPORT 51
LIST OF ABBREVIATIONS
ABBREVIATIONS EXPANSION
ML Machine Learning
AI Artificial Intelligence
RF Random Forest
UI User Interface
LR Logistic Regression
CHAPTER 1
INTRODUCTION
Health care data is generally large in volume and complex in structure. Large sets
of medical records created by medical professionals are available for analysis and
for extracting important information. Machine learning techniques can manage
large volumes of data and help obtain the needed information.
A machine learning framework of this type can encourage physicians to take
immediate action so that more patients can receive medication in a shorter period
of time, thus saving a significant number of lives.
This paper covers the prediction of different types of life-threatening heart
disease using different ML techniques. Heart disease, also known as
cardiovascular disease, encompasses a wide range of cardiovascular conditions
and has been a major cause of death worldwide for the past few decades. It is
associated with a wide range of risk factors, and there is a timely need for
accurate, reliable, and sensible approaches to early diagnosis.
The heart is a vital organ of the human body. It pumps blood to every part of our
anatomy. If it fails to function properly, the brain and various other organs
will stop working, and within a few minutes the person will die. Changes
in lifestyle, work-related stress, and bad food habits contribute to the
rise in the rate of several heart-related diseases. Heart diseases have
emerged as one of the most prominent causes of death around the world.
According to the World Health Organization (WHO), heart-related diseases
are responsible for taking 17.7 million lives every year, 31% of all global
deaths. In India too, heart-related diseases have become the leading cause of
mortality. Heart diseases killed 1.7 million Indians in 2016, according to the
2016 Global Burden of Disease report, released on September 15, 2017.
Heart-related diseases increase spending on health care and also reduce the
productivity of an individual. Estimates made by the WHO suggest that India
lost up to $237 billion from 2005 to 2015 because of heart-related or
cardiovascular diseases. Thus, feasible and accurate prediction of
heart-related diseases is very important.
Medical organizations all over the world collect data on various health-related
issues. These data can be exploited using various machine learning techniques
to gain useful insights. But the data collected is very large and, often, very
noisy. These datasets, which are too overwhelming for human minds to
comprehend, can be easily explored using various machine learning techniques.
Thus, these algorithms have recently become very useful for predicting the
presence or absence of heart-related diseases accurately.
The use of information technology in the health care industry is increasing day
by day to aid doctors in decision-making activities. It helps doctors and
physicians in disease management, medication, and the discovery of patterns
and relationships among diagnosis data. Current approaches to predicting
cardiovascular risk fail to identify many people who would benefit from
preventive treatment, while others receive unnecessary intervention. Machine
learning offers an opportunity to improve accuracy by exploiting complex
interactions between risk factors. We assessed whether machine learning can
improve cardiovascular risk prediction. Using a variety of techniques, it was
found that the SVM technique provided the best accuracy among others.
The main purpose of this work is to determine whether a patient has the disease
or not, using more than one algorithm rather than relying on a single one. We use
a dataset to detect the disease and apply different algorithms, both to understand
the algorithms better and to analyse the data for the most effective result. This
approach provided excellent accuracy for each disease.
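The idea of comparing several algorithms on the same data can be sketched as
follows. This is a minimal illustration, not the implementation used in this work:
it assumes scikit-learn is available, uses synthetic data from make_classification
as a stand-in for a real patient dataset, and the function name compare_classifiers
and all hyperparameter values are hypothetical choices made only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X, y, seed=0):
    """Fit four classifiers on one train/test split and return test accuracies."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    # Distance- and margin-based models get feature scaling; the forest does not need it.
    models = {
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    return scores


# Synthetic stand-in data (assumption): 300 "patients" with 13 numeric features.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
scores = compare_classifiers(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Because the data here is synthetic, the printed accuracies say nothing about heart
disease. The sketch only shows that all four models are trained and scored on the
same split, so their accuracies can be compared directly.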
Machine learning (ML) is the study of computer algorithms that can improve
automatically through experience and by the use of data. It is seen as a part
of artificial intelligence. Machine learning algorithms build a model based on sample
data, known as training data, in order to make predictions or decisions without being
explicitly programmed to do so. Machine learning algorithms are used in a wide
variety of applications, such as in medicine, email filtering, speech recognition,
and computer vision, where it is difficult or unfeasible to develop conventional
algorithms to perform the needed tasks.
Although the reasons mentioned are valid, a new dimension has been added in the
last decade: data is being used to predict what could potentially happen in the
future. Machine learning plays a significant role in doing so. Machine learning
is a subfield of Artificial Intelligence. Generally, its main aim is to
understand the structure of data and fit the best possible models to it, or to
identify hidden patterns. Developing a machine learning model is one of the key
steps in predicting a future problem, which in turn requires machine learning
algorithms. Numerous machine learning algorithms have been developed and are
mature enough to solve various real-world business problems.
Using machine learning, information is turned into knowledge. In the last 5-6
decades, an enormous amount of data has been recorded or collected, which will be
of no use if we don’t analyze it to find hidden patterns. To find useful and
significant patterns in complex data, several machine learning techniques are
available to ease the search. Subsequently, the identified hidden patterns and
knowledge of the problem can help with complex decision making and with
predicting future occurrences.
The term machine learning was coined in 1959 by Arthur Samuel, an
American IBMer and pioneer in the field of computer gaming and artificial
intelligence. The synonym self-teaching computers was also used in this time
period. A representative book of machine learning research during the 1960s
was Nilsson's book on Learning Machines, dealing mostly with machine
learning for pattern classification. Interest related to pattern recognition
continued into the 1970s, as described by Duda and Hart in 1973. In 1981 a
report was given on using teaching strategies so that a neural network learns
to recognize 40 characters (26 letters, 10 digits, and 4 special symbols) from
a computer terminal.
Modern-day machine learning has two objectives: one is to classify data
based on models which have been developed; the other is to make
predictions for future outcomes based on these models. A hypothetical
algorithm specific to classifying data may use computer vision of moles
coupled with supervised learning in order to train it to classify cancerous
moles, whereas a machine learning algorithm for stock trading may inform
the trader of potential future predictions.
In machine learning, tasks are typically classified into broad categories.
These categories are based on how learning is received or how feedback on the
learning is given to the system developed. Two of the most widely adopted
machine learning methods are supervised learning, which trains algorithms on
example input and output data that is labeled by humans, and unsupervised
learning, which provides the algorithm with no labeled data so that it can find
structure within its input data.
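The difference between the two methods can be illustrated with a small sketch. It
is an illustration only: the data values are made up, and the helper names
nearest_neighbor_predict and two_means are hypothetical. The supervised function
uses human-provided labels to classify a new value, while the unsupervised
function receives no labels and groups the same values by itself with a
two-cluster k-means.

```python
def nearest_neighbor_predict(train_points, train_labels, query):
    """Supervised: return the label of the closest labeled training point."""
    distances = [abs(p - query) for p in train_points]
    return train_labels[distances.index(min(distances))]


def two_means(points, iterations=10):
    """Unsupervised: split 1-D points into two clusters with k-means (k = 2)."""
    centers = [min(points), max(points)]  # simple deterministic starting centers
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to its nearest center.
            idx = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[idx].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Made-up measurements (assumption), with labels supplied by a human.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["low", "low", "low", "high", "high", "high"]

# Supervised learning: the labels are used to classify a new value.
predicted = nearest_neighbor_predict(values, labels, 4.5)
print(predicted)

# Unsupervised learning: the labels are never shown to the algorithm.
centers, groups = two_means(values)
print(groups)
```

In this sketch the unsupervised step recovers the same two groups that the labels
describe, but it cannot name them; attaching meaning to each cluster is left to the
analyst.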
Machine learning approaches are traditionally divided into three broad categories,
depending on the nature of the "signal" or "feedback" available to the learning
system:
Supervised learning: The computer is presented with example inputs and their
desired outputs, given by a "teacher", and the goal is to learn a general rule that
maps inputs to outputs.
Unsupervised learning is commonly used for transactional data. You may have
a large dataset of customers and their purchases, but as a human you will
probably not be able to make sense of what similar attributes can be drawn from
customer profiles and their types of purchases.
With this data fed into an unsupervised learning algorithm, it may be
determined that women of a certain age range who buy unscented soaps are
likely to be pregnant, and so a marketing campaign related to pregnancy and
baby products can be