Detection and Classification of Lung Diseases
Received Date: January 09, 2023 Accepted Date: February 09, 2023 Published Date: February 12, 2023
Corresponding author: Syamala KPL, Dept of Computer Science and Engineering, Kalasalingam Academy of Research and
Citation: Syamala KPL, Niharika CS, Jenny AM, Pavani P (2023) Detection and Classification of Lung Diseases using Machine
and Deep Learning Techniques. J Comput Sci Software Dev 2: 1-10
Abstract
Environmental change, pollution, and habits such as smoking and drinking can lead to many lung diseases that need early detection. Smoking exposes both smokers and the people around them to lung diseases, especially breathing problems. This paper proposes a website that takes a patient's symptoms, determines whether a disease is present, and gives a grade indicating how severe the disease is. A user who has an X-ray image and wants cross-verification can upload the image and view the results. Thus, the user input can be a string or an image. We focus on detecting chronic lung disease at an early stage, which in turn improves the chances of recovery and survival. We use machine learning and deep learning to predict diseases such as COVID-19, Tuberculosis, Pneumonia, and COPD. For COVID-19 and COPD, we achieved accuracies of 96.90% and 90.32%, respectively, using classification algorithms, and for the image dataset we obtained an accuracy of 98.58% using EfficientNet B0, a deep learning model.
Keywords: X-ray, Dataset, Machine Learning, Classification, EfficientNet B0, Deep Learning
©2023 The Authors. Published by the JScholar under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, provided the original author and source are credited.
Introduction

Lung disease refers to a variety of medical conditions that cause the lungs to work inefficiently. The most common lung diseases are Asthma, COPD, Hypertension, Lung cancer, Pneumonia, Tuberculosis, and Pulmonary edema. Of these, this paper predicts Pneumonia, Tuberculosis, Covid-19, and COPD using Machine Learning and Deep Learning techniques. Across the globe, lung diseases cause millions of deaths each year and are straining healthcare systems. A timely diagnosis is imperative for enhancing long-term survival and improving the chance of recovery.

There are several causes of lung disease, including smoking, alcohol consumption, and pollution. COPD affects 65 million people worldwide and kills 3 million people each year, making it the third most common cause of death. Almost 15 percent, or roughly one in seven, of middle-aged and older adults have lung disease.
During the recent Covid-19 pandemic, having a chronic lung disease meant a high risk of severe illness, complications, and death. One-fourth of Covid-19 cases involve an infection that affects both lungs.

In this paper we present machine learning algorithms to determine the severity of diseases based on their symptoms, as well as EfficientNet B0, a deep learning model used to detect diseases from chest X-ray images. We tried different algorithms to detect each disease and chose the model with the best accuracy. The development of deep learning technology for medical images, such as chest X-rays, has shown great potential for detecting lung disease.
This project uses both Machine Learning and Deep Learning, combining patient information and symptom data with X-ray images, and uses EfficientNetB0 as a well-trained model, to predict whether a patient has a lung disease. As technology advances and the world changes rapidly, the pressure on health is increasing, because changes in environment and climate raise the risk of disease. This paper focuses on one of these issues: lung diseases. The application is intended to be used before a patient's treatment in health care systems; in addition, the patient information it collects can support better service during treatment.

Database

Our COPD dataset includes 101 patients and 24 variables. It contains patient characteristics such as AGE, GENDER, and Smoking, along with disease severity and co-morbidities. It also has measures of walking ability, quality of life, and anxiety and depression. The stages of COPD in the dataset, Gold1 to Gold4, are labeled Mild, Moderate, Severe, and Very Severe.

Tuberculosis

The tuberculosis dataset consists of 16 columns with symptoms such as fever, coughing blood, chest pain, night sweats, and weight loss. In the Gender column, 0 refers to women and 1 to men.
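The encodings described for the COPD and tuberculosis datasets can be sketched as two small lookup tables. This is only an illustration: the names `GOLD_STAGE_LABELS`, `GENDER_CODES`, `gold_label`, and `encode_gender` are hypothetical and do not come from the paper, and the text spellings "female"/"male" are an assumption about how a raw Gender value might appear before encoding.

```python
# Hypothetical lookup tables mirroring the encodings stated in the text.
GOLD_STAGE_LABELS = {1: "Mild", 2: "Moderate", 3: "Severe", 4: "Very Severe"}
GENDER_CODES = {"female": 0, "male": 1}  # 0 refers to women, 1 to men


def gold_label(stage):
    """Return the severity label for a GOLD stage number (1 to 4)."""
    if stage not in GOLD_STAGE_LABELS:
        raise ValueError(f"unknown GOLD stage: {stage!r}")
    return GOLD_STAGE_LABELS[stage]


def encode_gender(value):
    """Map a textual gender value to the 0/1 code used in the Gender column."""
    key = value.strip().lower()
    if key not in GENDER_CODES:
        raise ValueError(f"unknown gender value: {value!r}")
    return GENDER_CODES[key]


if __name__ == "__main__":
    print(gold_label(3))
    print(encode_gender("Male"))
```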
Image dataset

For the image dataset we use ReLU as the activation function for the input layers and SoftMax for the output layer.

Proposed Methodology

We worked with both a CSV dataset, which contains disease symptoms, and an image dataset, which consists of radiology images. For the CSV data we worked on three diseases: Covid, Tuberculosis, and COPD. Each dataset uses its own algorithm: decision trees for both Covid and COPD, and k-means clustering for Tuberculosis. For the image dataset we use EfficientNetB0, a predefined model pre-trained on ImageNet.

We built a website using Streamlit, an open-source Python library for creating and sharing web apps for Machine Learning projects. The website initially runs privately and must be deployed to Heroku to make it public. On the website, the disease is predicted as positive or negative based on the input given by the patient.
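The image branch (EfficientNetB0 with ReLU and SoftMax activations) might be assembled in Keras roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: the input shape, the size of the dense layer, the dropout rate, and the number of output classes are all assumed, while the optimizer, loss, metric, batch size, and epoch count are the values the paper reports. TensorFlow is imported inside the function so the constants can be inspected without it installed.

```python
# Values reported in the paper.
BATCH_SIZE = 32
EPOCHS = 15

# Assumptions made for this sketch only.
NUM_CLASSES = 4               # Covid, Pneumonia, Tuberculosis, COPD (assumed)
INPUT_SHAPE = (224, 224, 3)   # default EfficientNetB0 resolution (assumed)
DENSE_UNITS = 128             # hypothetical head size
DROPOUT_RATE = 0.2            # hypothetical dropout ratio


def build_model(num_classes=NUM_CLASSES, input_shape=INPUT_SHAPE,
                weights="imagenet"):
    """Build EfficientNetB0 with a small classification head and compile it."""
    # Lazy import: TensorFlow is only needed when the model is actually built.
    import tensorflow as tf
    from tensorflow.keras import layers

    base = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=weights, input_shape=input_shape)

    x = layers.GlobalAveragePooling2D()(base.output)
    x = layers.BatchNormalization()(x)
    x = layers.Dense(DENSE_UNITS, activation="relu")(x)
    x = layers.Dropout(DROPOUT_RATE)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Training would then call `model.fit(...)` with `epochs=EPOCHS` on training and validation data batched at `BATCH_SIZE`, followed by `model.evaluate(...)` to obtain the loss and accuracy.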
The following are the layers of a CNN:

Convolutional Layer: Convolution layers are used to extract image features, such as edges and intersection points, that carry rich information. The number of layers matters here.

We can change the architecture by using different activation functions with different numbers of features.

The network will include the following components:

Activation functions: Our model uses the ReLU and Softmax activation functions, with Softmax applied at the output layer.

Pooling Layers: A pooling layer is added after a convolution layer to perform successive dimensionality reduction, i.e., reducing the number of parameters and computations, thereby shortening the training time and reducing overfitting.

Dense layers: The final layer of a CNN is the dense layer, which takes all the feature data produced by the convolution layers and analyzes it.

Dropout Layers: Deep learning networks have many weight and bias parameters, which makes them prone to overfitting; dropout layers are used to prevent this.

In the dropout layer, a subset of features from the input layer and a subset of neurons from the hidden layer are selected according to the value of p. The selected neurons and features are deactivated and the others remain active.

Dropout ratio: 0 ≤ p ≤ 1
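To make the pooling and dropout descriptions concrete, the sketch below implements 2x2 max pooling and dropout in plain Python. It is an illustration only: it assumes p is the probability of deactivating a unit, and it uses the common "inverted dropout" convention of rescaling the surviving values by 1/(1 - p), which the paper does not specify. The function names are hypothetical.

```python
import random


def max_pool_2x2(feature_map):
    """Downsample a 2D feature map by taking the max of each 2x2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


def dropout(activations, p, rng):
    """Zero each value with probability p and rescale the survivors."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("dropout ratio p must satisfy 0 <= p <= 1")
    if p == 1.0:
        # Every unit is deactivated.
        return [0.0 for _ in activations]
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else a * keep_scale for a in activations]


if __name__ == "__main__":
    fmap = [[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]
    print(max_pool_2x2(fmap))  # a 4x4 map becomes 2x2
    print(dropout([1.0, 1.0, 1.0, 1.0], 0.5, random.Random(0)))
```

Pooling quarters the number of values passed to later layers, which is where the savings in parameters and computation come from; dropout is applied only during training.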
Batch Normalization: This method is used when training neural network models. It normalizes the inputs to a layer over each mini-batch, which reduces the number of epochs required to train a deep neural network. Each layer of the network can learn more independently, enabling faster training.

To build the model, we use Adam as the optimizer, categorical cross-entropy as the loss, and accuracy as the metric. After building and compiling the model, the data is split into training and validation data. The model is trained with a batch size of 32 for 15 epochs.

After training completes, we evaluate the model and calculate the loss and accuracy.

To predict disease from symptoms, we use machine learning algorithms. First, we import NumPy and Pandas for linear algebra, data processing, and reading the CSV file. We then preprocess the data and convert the categorical data to numerical data. Second, we split the data into training and test sets. Finally, we fit the algorithms to the training data. The algorithms we used are KNN and the Decision Tree Classifier, supervised machine learning algorithms that can solve both regression and classification problems. A website is developed where the disease can be predicted using both symptoms and radiology images. The patient's symptoms are entered for the selected disease, and the website predicts whether the disease is present using the fitted models. Alternatively, radiology images are given as input to the model trained with EfficientNetB0 and fitted with different activation layers, which predicts the disease with the best accuracy on the image dataset. Finally, the website is an interface that helps patients predict disease from their symptoms accurately and efficiently.

Experiment Results

We use both Deep Learning and Machine Learning algorithms to predict lung diseases such as COVID-19, Tuberculosis, Pneumonia, and COPD. For the symptoms dataset, i.e., the CSV data, we obtained accuracies of 96.90% and 90.32% for Covid-19 and COPD, respectively, using a classification algorithm, i.e., Decision Trees. For Tuberculosis we use K-Means Clustering.

For the image dataset, which contains radiology images of lung diseases such as Covid, Pneumonia, Tuberculosis, and COPD, we obtained an accuracy of 98.58% using EfficientNet B0, a deep learning model. Using the Streamlit package, we built a website that helps the user/patient detect chronic lung disease at an early stage, which in turn improves the chances of recovery and survival.
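The symptom-based workflow in the methodology (convert categorical data to numbers, split into training and test sets, fit a classifier, measure accuracy) can be sketched end to end with a small hand-written KNN classifier. Everything here is illustrative: the records are made-up toy data, not the paper's datasets, the helper names are hypothetical, and the paper's own implementation presumably relies on library classifiers rather than this code.

```python
import random
from collections import Counter

FEATURES = ["fever", "cough", "weight_loss"]

# Made-up toy records for illustration only; not taken from the paper's data.
RECORDS = [
    {"fever": "yes", "cough": "yes", "weight_loss": "yes", "label": "positive"},
    {"fever": "yes", "cough": "yes", "weight_loss": "no", "label": "positive"},
    {"fever": "yes", "cough": "no", "weight_loss": "yes", "label": "positive"},
    {"fever": "no", "cough": "yes", "weight_loss": "yes", "label": "positive"},
    {"fever": "no", "cough": "no", "weight_loss": "no", "label": "negative"},
    {"fever": "no", "cough": "no", "weight_loss": "yes", "label": "negative"},
    {"fever": "no", "cough": "yes", "weight_loss": "no", "label": "negative"},
    {"fever": "yes", "cough": "no", "weight_loss": "no", "label": "negative"},
]


def encode(record, feature_names=FEATURES):
    """Convert yes/no categorical answers to 1/0 numeric features."""
    return [1 if record[name] == "yes" else 0 for name in feature_names]


def train_test_split(rows, test_fraction, seed):
    """Shuffle a copy of the rows and split off a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k nearest training points."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


if __name__ == "__main__":
    train, test = train_test_split(RECORDS, test_fraction=0.25, seed=0)
    train_x = [encode(r) for r in train]
    train_y = [r["label"] for r in train]
    predictions = [knn_predict(train_x, train_y, encode(r)) for r in test]
    print("accuracy:", accuracy([r["label"] for r in test], predictions))
```

With only eight toy records the reported accuracy is meaningless; the point is the order of the steps, which matches the description in the text.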
In this project we developed a website for predicting and classifying lung diseases with better accuracy, which supports the patient ahead of treatment. We used both machine learning and deep learning algorithms for better classification and prediction.