Detection and Classification of Lung Diseases


Journal of

Computer Science and Software Development

Research Article Open Access

Detection and Classification of Lung Diseases using Machine and Deep Learning Techniques
Syamala KPL*, Niharika CS, Jenny AM, Pavani P
Department of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India

Received Date: January 09, 2023 Accepted Date: February 09, 2023 Published Date: February 12, 2023

*Corresponding author: Syamala KPL, Dept of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Krishnankoil, Tamil Nadu, India, Email: poojitakkr@gmail.com

Citation: Syamala KPL, Niharika CS, Jenny AM, Pavani P (2023) Detection and Classification of Lung Diseases using Machine
and Deep Learning Techniques. J Comput Sci Software Dev 2: 1-10

Abstract

Changes in the environment, pollution, and unhealthy daily habits such as smoking and drinking can lead to many lung diseases, which need early detection. Smoking exposes both smokers and the people around them to lung diseases, especially breathing problems. This paper proposes a website that takes the patient's symptoms, determines whether a disease is present, and gives a grade that indicates how severe the disease is. A user who has an X-ray image and wants cross-verification can upload the image and view the results. Thus, the user input can be a string or an image. We focus on detecting chronic lung disease at an early stage, which in turn improves the chances of recovery and survival. We use deep learning and machine learning to predict diseases such as COVID-19, Tuberculosis, Pneumonia, and COPD. For COVID-19 and COPD, we achieved accuracies of 96.90% and 90.32% respectively using classification algorithms, and for the image dataset we obtained an accuracy of 98.58% using EfficientNet B0, a deep learning model.

Keywords: X-ray, Dataset, Machine Learning, Classification, EfficientNet B0, Deep Learning

©2023 The Authors. Published by the JScholar under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, provided the original author and source are credited.

JScholar Publishers J Comput Sci Software Dev 2023 | Vol 2: 102



Introduction

Lung disease refers to a variety of medical conditions that cause the lungs to work inefficiently. The most common lung diseases are Asthma, COPD, Hypertension, Lung cancer, Pneumonia, Tuberculosis, and Pulmonary edema. Among these diseases, in this paper we predict Pneumonia, Tuberculosis, Covid-19 and COPD using Machine Learning and Deep Learning techniques. Across the globe, lung diseases cause millions of deaths each year and are straining healthcare systems. A timely diagnosis is imperative for enhancing long-term survival and improving the chance of recovery.

There are several causes of lung diseases, including smoking, alcohol consumption, and pollution. COPD affects 65 million people worldwide and kills 3 million people each year, making it the third most common cause of death. Almost 15 percent, or roughly one in seven, of middle-aged and older adults have lung disease.

Figure 1: Causes of Lung Diseases

During the recent Covid pandemic, having a chronic lung disease meant you were at high risk of severe illness and complications, which also caused deaths. One-fourth of Covid-19 cases involve an infection that affects both lungs.

In this paper we present machine learning algorithms to determine the severity of diseases based on their symptoms, as well as EfficientNet B0, a deep learning model used to detect diseases from chest X-ray images. We tried different algorithms to detect each disease and chose the model with the best accuracy. Deep learning on medical images, such as chest X-rays, has shown great potential for detecting lung disease.

Figure 2: Different Types of Lung Diseases

Problem Statement

In this project we use both Machine Learning and Deep Learning, combining patient symptom information with X-ray images and using EfficientNetB0 as a pre-trained model, to predict whether a patient has a lung disease. As technology advances and the world changes quickly, the pressure on health is rapidly increasing because changes in environment and climate have increased people's risk of disease. This paper focuses on one of these issues: lung diseases. The application is intended to be used before the patient is treated in the health care system; in addition, the patient information it collects can support better service during treatment.

Database

In this paper, we present an experimental analysis of the proposed model on several popular lung disease datasets: the COVID, Tuberculosis, and COPD datasets and a chest X-ray dataset. This project uses disease datasets from Kaggle. Before moving to the results, we give a brief overview of our datasets.

COPD

Our COPD dataset includes 101 patients and 24 variables. It contains patient characteristics such as AGE, GENDER and Smoking, disease severity, and co-morbidities. It also has measures of walking ability, quality of life, and anxiety and depression. The stages of COPD in the dataset, Gold1 to Gold4, are taken as Mild, Moderate, Severe and Very Severe.

Tuberculosis

The tuberculosis dataset consists of 16 columns with symptoms such as fever, coughing blood, chest pain, night sweats, and weight loss. In the Gender column, 0 refers to women and 1 to men.

COVID

The COVID dataset has symptoms such as Cough, Fever, Sore Throat, Shortness of Breath, and Headache, together with whether the person is aged 60 or above, and Gender. The categorical data is preprocessed to convert it to numerical data.

Figure 3: Depicts the detail information of dataset

Image dataset

The dataset contains a total of 7135 X-ray images. It is organized into train, test and val folders, each with one subfolder per image category: Normal, Pneumonia, Covid-19, and Tuberculosis. EfficientNetB0 is used to detect and classify the disease from the X-ray images.
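The folder layout described above lends itself to a table of file paths and labels. The sketch below walks one split folder and collects `(filepath, label)` pairs using only the standard library. The exact folder names, the accepted file extensions, and the helper `list_labeled_images` are assumptions; the demo at the end builds a tiny temporary folder tree only to show the call.

```python
import tempfile
from pathlib import Path

CLASS_NAMES = ["Normal", "Pneumonia", "Covid-19", "Tuberculosis"]  # folder names assumed
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed file types


def list_labeled_images(split_dir):
    """Return (filepath, label) pairs for one split folder such as train/."""
    rows = []
    for class_name in CLASS_NAMES:
        class_dir = Path(split_dir) / class_name
        if not class_dir.is_dir():
            continue  # tolerate a missing class folder
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                rows.append((str(path), class_name))
    return rows


# Demo on an empty, temporary folder tree (no real images involved).
with tempfile.TemporaryDirectory() as root:
    train_dir = Path(root) / "train"
    for class_name, file_name in [("Normal", "n1.jpeg"), ("Tuberculosis", "t1.png")]:
        (train_dir / class_name).mkdir(parents=True)
        (train_dir / class_name / file_name).touch()
    demo_rows = list_labeled_images(train_dir)
```

The resulting list can be handed to `pandas.DataFrame(rows, columns=["filepath", "label"])` if a DataFrame is preferred.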


Figure 4: Some Sample pictures of the Image Dataset

Proposed Methodology

Our model works on both a CSV dataset, which contains disease symptoms, and an image dataset, which consists of radiology images. For the CSV data we worked on three diseases: Covid, Tuberculosis and COPD. We use different algorithms for each dataset: decision trees for Covid and COPD, and k-means clustering for Tuberculosis. For the image dataset we use EfficientNetB0, a model pre-trained on ImageNet, with ReLU as the activation function for the input layers and SoftMax for the output layer.

We built a website using Streamlit, an open-source Python library for creating and sharing web apps for Machine Learning projects. The website is private by default and should be deployed on Heroku to make it public. On the website, the disease is predicted as positive or negative based on the input given by the patient.
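A rough sketch of such a Streamlit page is shown below. It is not the authors' application: the symptom list is taken from the COVID dataset description, while `predict_from_symptoms` is a hypothetical placeholder rule that stands in for a trained classifier, and the page layout is an assumption.

```python
def predict_from_symptoms(symptoms):
    """Placeholder rule standing in for a trained classifier (hypothetical)."""
    return "positive" if sum(symptoms.values()) >= 3 else "negative"


def main():
    import streamlit as st  # imported here so the helper above works without Streamlit

    st.title("Lung disease prediction")
    names = ["Cough", "Fever", "Sore Throat", "Shortness of Breath", "Headache"]
    # One checkbox per symptom, stored as 0 or 1.
    symptoms = {name: int(st.checkbox(name)) for name in names}
    uploaded = st.file_uploader("Optional chest X-ray", type=["png", "jpg", "jpeg"])
    if st.button("Predict"):
        st.write("Symptom-based result:", predict_from_symptoms(symptoms))
        if uploaded is not None:
            st.write("Image received; an image model would classify it here.")
```

To try it, save the code as a script (for example `app.py`, a name chosen here for illustration), add a call to `main()` at the end of the file, and launch it with `streamlit run app.py`.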

Figure 5: Steps to follow for Building CNN to data


Implementation

Import all the libraries required for building the model: NumPy for numerical calculations, Pandas for reading the CSV files, the machine learning algorithms KNeighborsClassifier and DecisionTreeClassifier for prediction, and Keras for developing and evaluating deep learning models. We then import the dataset. We also import Keras layers such as Dense, Conv2D, MaxPooling2D, Flatten, and Dropout, and EfficientNetB0 from Keras applications. Fig 9 shows the steps to follow for building the CNN model.

The dataset has four classes: Covid, Pneumonia, Tuberculosis, and no disease. After importing the image dataset, the first step is to preprocess the data by creating a DataFrame with the file path and the label of each picture.

The second step is to prepare the images for training by specifying the file path and label columns and the target size.

The third step is to design a neural network model by initializing the EfficientNetB0 model, which is used to build deep learning models with improved efficiency and accuracy.

In this model, EfficientNetB0 is connected to input layers such as Normalization, ZeroPadding2D, BatchNormalization, and Conv2D, which process the output of the preceding layers. In the output layer, the SoftMax function serves as the activation function. The loss function is categorical cross-entropy, which is used for multi-class classification with two or more output labels. The optimizer is Adam, a stochastic gradient descent method for training the model.
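The model-building step can be sketched with the Keras API as follows. This is an illustration under stated assumptions, not the authors' exact network: the 224x224 input size, the global average pooling, the single Dense head, and the class order in `CLASS_NAMES` are all choices made here. Only the EfficientNetB0 backbone, the softmax output, the Adam optimizer, and the categorical cross-entropy loss come from the text. TensorFlow is imported inside the function so that the small helper `decode_prediction` can be used without it.

```python
CLASS_NAMES = ["Covid", "Normal", "Pneumonia", "Tuberculosis"]  # order is an assumption


def build_classifier(num_classes=len(CLASS_NAMES), image_size=224, weights="imagenet"):
    """EfficientNetB0 backbone with a softmax head, compiled with Adam."""
    from tensorflow import keras  # imported lazily; requires TensorFlow

    backbone = keras.applications.EfficientNetB0(
        include_top=False,          # drop the 1000-class ImageNet head
        weights=weights,            # "imagenet" for pre-trained weights, None for random
        input_shape=(image_size, image_size, 3),
        pooling="avg",              # global average pooling over the feature maps
    )
    outputs = keras.layers.Dense(num_classes, activation="softmax")(backbone.output)
    model = keras.Model(inputs=backbone.input, outputs=outputs)
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model


def decode_prediction(probabilities):
    """Map one row of softmax outputs to (class name, probability)."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[best], probabilities[best]
```

Calling `build_classifier()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random initialization.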

Figure 6: The concept of Convolutional Neural Network (CNN)

Figure 7: ReLu Activation Function


Figure 8: ReLu Activation Function Derivative

Figure 9: Steps to follow for Building CNN to data

The following are the layers of CNN:

Convolutional Layer: Convolution layers are used to extract image features, such as edges and intersection points, which carry rich information. The number of layers matters here.

We can change the architecture by using different activation functions with different numbers of features.

The network will include the following components:

Activation functions: Our model uses the ReLU and Softmax activation functions; Softmax is applied in the output layer.

Pooling Layers: A pooling layer is added after a convolution layer, performing continuous dimensionality reduction, i.e., reducing the number of parameters and computations, thereby shortening the training time and reducing overfitting.

Dense layers: The dense layer is the final layer of the CNN; it takes all the feature data produced by the convolution layers and analyzes it.

Dropout Layers: Deep learning networks have many weight and bias parameters, which makes them prone to overfitting; dropout layers are used to prevent this.

In the dropout layer, we select a subset of features from the input layer and a subset of neurons from the hidden layer, according to the p value. Some neurons and features are deactivated and others are activated.

Dropout ratio: 0 ≤ p ≤ 1
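To make the components above concrete, here is a small pure-Python sketch of ReLU, softmax, and dropout with ratio p. It is meant only to illustrate the definitions; a real model would use the framework's built-in layers. The rescaling of surviving values by 1 / (1 - p) follows the common "inverted dropout" convention and is an assumption about how the layer behaves, not something stated in the paper.

```python
import math
import random


def relu(x):
    """ReLU keeps positive inputs and maps everything else to zero."""
    return max(0.0, x)


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, p, rng=random):
    """Zero each value with probability p and rescale survivors by 1 / (1 - p)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must satisfy 0 <= p <= 1")
    if p == 1.0:
        return [0.0 for _ in values]  # everything is dropped
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * keep_scale for v in values]
```

With p = 0 nothing is dropped, and with p = 1 every value is deactivated, which matches the stated range of the dropout ratio.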


Batch Normalization: This method is used when training neural network models. With batch normalization, fewer epochs are required to train the deep neural network. Each layer of the network can learn more independently, enabling faster training.

To build the model, we use Adam as the optimizer, categorical cross-entropy as the loss, and accuracy as the metric. After building and compiling the model, the data is split into training data and validation data. Our model uses a batch size of 32 with 15 epochs.

After completion of training, we evaluate the model and calculate the loss and accuracy.

To predict the disease from symptoms, we use machine learning algorithms. First, we import NumPy and Pandas for linear algebra, data processing, and reading the CSV file, then preprocess the data and convert the categorical data to numerical data. Second, we split the data into train and test sets. Finally, we fit the algorithms to the training data. The algorithms we used are KNN and the Decision Tree Classifier, which are supervised machine learning algorithms that can solve both regression and classification problems.

A website is developed where we can predict the disease by using both symptoms and radiology images. The patient's symptoms are entered for the selected disease, and the site predicts whether the disease is present based on the fitted models. Alternatively, radiology images are given as the input; here the model is trained with EfficientNetB0 and fitted with different activation layers to predict the disease from the images with the best accuracy. Finally, the website is an interface that helps patients predict the disease from their symptoms accurately and efficiently.

Experiment Results

We use both Deep Learning and Machine Learning algorithms to predict various lung diseases such as COVID-19, Tuberculosis, Pneumonia, and COPD. For the symptoms datasets, i.e., the CSV data, we obtained accuracies of 96.90% and 90.32% for Covid-19 and COPD respectively using a classification algorithm, i.e., Decision Trees. For Tuberculosis we use K-Means Clustering.

For the image dataset, which contains radiology images of lung diseases like Covid, Pneumonia, and Tuberculosis, we obtained an accuracy of 98.58% using EfficientNet B0, a deep learning model. Using the Streamlit package, we built a website which helps the user or patient detect chronic lung disease at an early stage, which in turn enhances the chances of recovery and survival.
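The symptom pipeline behind the reported accuracies (encode, split, fit, score) can be sketched as below. The helpers `split_rows`, `accuracy`, and `fit_and_score`, the 80/20 split, and the fixed random seed are assumptions made for illustration; the paper does not state its split ratio or hyperparameters. scikit-learn is imported inside `fit_and_score`, so the splitting and scoring helpers run without it.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_and_score(rows):
    """rows: list of (feature_list, label) pairs. Requires scikit-learn."""
    from sklearn.tree import DecisionTreeClassifier  # imported lazily

    train, test = split_rows(rows)
    model = DecisionTreeClassifier(random_state=0)
    model.fit([x for x, _ in train], [y for _, y in train])
    predictions = model.predict([x for x, _ in test])
    return accuracy([y for _, y in test], list(predictions))
```

Swapping `DecisionTreeClassifier` for `sklearn.neighbors.KNeighborsClassifier` gives the KNN variant mentioned in the text with the same fit and predict calls.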

Figure 10: User Interface of website


Figure 11: Prediction via Symptoms

Figure 12: Prediction via Radiology Images

In this project we tried to develop a website for predicting and classifying lung diseases with better accuracy, which helps in early treatment of the patient. We used both machine learning and deep learning algorithms for better classification and prediction.


Figure 13: Heat Map for COPD Dataset model

Table 1: Accuracy Results

Disease                                            Model               Accuracy
Covid                                              Decision Trees      96.90%
Tuberculosis                                       K-Means clustering  Error rate: 0.33
COPD                                               Decision Trees      90.32%
Radiology images (Covid, Pneumonia, Tuberculosis)  EfficientNetB0      Train: 95.58%, Test: 86.18%

Conclusion

The main aim of our research is to build a website that predicts different types of lung diseases from patients' symptoms and X-ray images in real time. Our work enables the model to predict diseases from different patients' X-ray images. We use Machine Learning algorithms, namely Decision Trees and clustering, for prediction from symptoms, and Keras, a Python deep learning library, together with TensorFlow on X-ray images to predict the severity of the disease. We built a model and trained it so that it can detect and classify the images and the symptoms in real time. The model gave good accuracy but takes a long time to train. We conclude by hoping that this paper has conveyed the overall concept of designing a real-time lung disease prediction system using ML and DL, with pre-trained models like EfficientNet B0 for better model performance.


