Convolutional Neural Network (CNN) and Federated Learning Based Privacy Preserving Approach For Skin Disease Classification

The Journal of Supercomputing
https://doi.org/10.1007/s11227-024-06309-0
Divya1 · Niharika Anand1 · Gaurav Sharma2
Accepted: 13 June 2024

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature
2024
Abstract
This research presents a study on the classification of human skin diseases using medical imaging, with a focus on data privacy preservation. Skin disease diagnosis is primarily done visually and can be challenging because of the varied colors and complex formation of diseases. The proposed solution involves an image dataset with seven classes of skin disease, a convolutional neural network (CNN) model, and image augmentation to increase dataset size and improve model generalization. The proposed CNN model attained an average precision of 86% and an average recall of 81% across all seven classes of skin diseases. To safeguard the privacy of the data, a federated learning method was used, in which the data were split among 500, 1000, and 2000 clients. With the proposed scheme, which combines a CNN for disease classification with federated learning, the average accuracy was 82.42%, 87.26%, and 93.25% for the different numbers of clients. The findings indicate that a CNN-based approach coupled with federated learning can classify skin diseases effectively without compromising the confidentiality of patient data.
Keywords Skin disease classification · Medical imaging · Convolutional neural network (CNN) · Federated learning · Data privacy · Data security · Performance evaluation
Niharika Anand and Gaurav Sharma have contributed equally to this work.
1 Introduction
Skin diseases are a widespread and significant health concern globally, with various factors influencing their impact, including environmental factors and genetic susceptibility [1]. Social factors such as socioeconomic status, education, leisure time, and access to healthcare also affect how prevalent and how severe skin diseases are [2]. According to [3], skin diseases rank fourth among the most common nonfatal diseases globally. Skin diseases can lead to psychological and social problems, such as depression, anger, anxiety, low self-esteem, and even social isolation [4]. Proper identification is crucial for effective therapy, but doctors have trouble making accurate diagnoses because many skin diseases look alike in shape and color [5]. Machine learning has revolutionized medical imaging, enabling more accurate disease detection and classification, including for skin diseases. Advances in processing power and the wide availability of medical imaging data have contributed to the remarkable performance of machine learning models in medical science [6]. CNNs have shown exceptional advancements in medical image processing [7]. However, using clinical images for research raises issues such as differing and complex contexts, varying resolutions, and privacy concerns, especially with sensitive images [8]. Moreover, skin disease image datasets often lack clear labels and information, and the number of available labeled datasets is limited [9]. To overcome these challenges, researchers have turned to an emerging concept called federated learning. Federated learning allows data to remain distributed across clients, thus preserving user privacy and data confidentiality [10].
The remainder of this paper is organized as follows: Sect. 2 discusses the background literature, Sect. 3 outlines the proposed methodology, Sect. 4 describes the image dataset, Sect. 5 analyzes the results, and Sect. 6 concludes with a discussion of the next steps for this research.
2 Background literature
A variety of scholarly studies have focused on the categorization and identification of dermatological conditions, employing deep learning algorithms to improve precision and accuracy. According to [11], Esteva et al. employed the Inception V3 architecture in 2017 to categorize skin tumors and attained an average accuracy of 55.4%, demonstrating the effectiveness of such techniques. Similarly, Codella et al. [12], Zhang et al. [13], and Shanthi et al. [14] employed various machine learning techniques to classify skin diseases and diabetic retinopathy images. Codella et al. utilized a nonlinear SVM algorithm, which resulted in an average accuracy of 76% for detecting melanoma, while Zhang et al. achieved an accuracy of 87.25% for classifying four common skin diseases using the Inception V3 architecture. Furthermore, Shanthi et al. [14] utilized a modified AlexNet architecture to classify diabetic retinopathy images, with a maximum accuracy of 96.6%.
Tushabe et al. [15] compared different machine learning methods for identifying skin diseases, viral infections, and bacterial infections. They concluded that the KNN classifier was extremely accurate, with a maximum accuracy of 100%. Sheha et al. [16] recommended a classification approach using an MLP classifier to detect melanoma, which resulted in training and testing accuracies of 100% and 92%, respectively. Moreover, Gurovich et al. [17] introduced the DeepGestalt CNN algorithm, trained on 17,000 facial images of genetic syndromes. The algorithm in [17] was able to accurately identify more than 200 genetic syndromes. To the best of our knowledge, no skin disease classification model has been designed with privacy considerations in mind. However, researchers are currently exploring the feasibility of federated learning techniques for medical health applications. Specifically, Olivia et al. [18] introduced a federated learning framework for learning a global model from locally scattered health data retained at multiple sites. Also, as shown in [18, 19], Li et al. evaluated how feasible it would be to use differential-privacy measures to safeguard patient data in a federated learning environment. In 2020, using a COVID-19 dataset, Boyi et al. [20] showed how to train models with federated learning and examined the performance of four different models trained with and without the federated learning framework. These advancements highlight the potential for enhancing privacy in medical applications of machine learning.
Recent advancements in deep learning have significantly improved the accuracy
of skin disease classification. Liu et al. [4] introduced a deep learning system that
achieved high performance in differential diagnosis of skin diseases, showcasing
the potential of CNNs in medical imaging. Raghu et al. [21] explored transfer
learning techniques, demonstrating their effectiveness in medical applications
and highlighting their ability to enhance model performance with limited data.
Yang et al. [22] proposed an attention-based CNN that improved classification
performance by focusing on relevant image regions, which is crucial for accurate
skin lesion diagnosis. Li et al. [23] developed a dense convolutional network with
self-attention mechanisms, achieving superior results in skin disease classification
by effectively capturing complex patterns. Wang et al. [24] combined deep
learning with attention mechanisms for automated skin lesion segmentation
and classification, achieving state-of-the-art accuracy. Zhang et al. [13] utilized
EfficientNet with transfer learning, demonstrating the model’s efficiency and high
accuracy in skin lesion diagnosis. These studies reflect the rapid advancements in
the field and underscore the importance of incorporating recent methodologies to
enhance the robustness and reliability of classification systems.
CNNs have been particularly influential, demonstrating state-of-the-art
performance in large-scale image recognition challenges. Notable works include
Krizhevsky et al.’s pioneering use of deep CNNs for the ImageNet classification
task [25]. Additionally, transformers, initially designed for natural language
processing, have shown remarkable success in vision tasks through self-attention
mechanisms. Vaswani et al.’s seminal paper introduced the transformer architecture,
highlighting its effectiveness in capturing long-range dependencies in data [26].
More recently, hybrid models that combine the strengths of CNNs and transformers
have emerged, further pushing the boundaries of image classification accuracy. For
instance, Dosovitskiy et al. proposed the vision transformer (ViT), which applies a
pure transformer directly to sequences of image patches [27]. Similarly, Liu et al.
introduced the Swin transformer, which incorporates hierarchical structure and
shifted windows to enhance performance [28]. These advancements underscore
the evolution and diversification of deep learning techniques, establishing them as
industry-standard approaches in the field.
In this research, we have examined the suitability of utilizing CNN for accurately
classifying skin diseases while retaining data privacy by incorporating federated
learning algorithms. Our approach aims to enhance the accuracy and security of
skin disease classification using deep learning techniques.
We present a new CNN model for skin disease classification and improve its performance through hyperparameter tuning. Finally, we compare the performance of our proposed CNN model with that of other industry-standard techniques.
Through the integration of these methodologies, we have put forth a
comprehensive methodology that effectively improves the precision and safeguarding
of skin disease classification through the use of deep learning techniques. The
anticipated outcome of this study is to make a substantial scholarly contribution
toward the advancement of more precise and reliable models for classifying skin
illnesses. Such models have the potential to improve the accuracy of diagnoses
and treatment techniques for various skin conditions. In light of the sensitive nature
of certain skin disease photographs, particularly those that display intimate bodily
parts and organs, our study focused on exploring federated learning methods as a
means to improve the security and confidentiality of medical imaging data present in
the dataset. Our federated learning approach ensured that sensitive medical images
were kept locally on each user’s device, and the model was trained using only the
aggregated information, rather than the raw data. This approach helped to ensure
that the privacy of the sensitive medical images was preserved while still enabling
the model to learn from the dataset as a whole.
The strategy we suggest offers a potential answer to the shortage of skin disease data currently accessible for research purposes. In the future, constructing a specialized dataset of skin disease photographs may be a research undertaking of considerable worth. Moreover, the proposed approach has the potential to be used in other medical imaging applications that handle confidential information. Table 1 compares previous work with our contribution.
3 Methodology
This section describes the methodology applied in this research. Several CNN models, such as AlexNet and VGG16, are examined and compared with one another. In addition, we investigate a federated learning framework as part of this work.
A neural network is a mathematical model structured as an input layer, hidden layers, and an output layer. The layers are made up of neurons that are linked together by weight parameters, and these parameters are optimized using a loss function and the backpropagation method.
CNNs are neural networks with several layers that include convolution, activation, pooling, and fully connected layers. The proposed topology of the CNN is shown in Fig. 1, along with the dimensions and sizes of each layer and filter used in the network. To extract features from the input images, the proposed model deploys a deep architecture that includes 28 convolutional layers. The architecture incorporates depthwise separable convolutions, which are a combination of depthwise and pointwise convolutions. The initial layer comprises 32 filters of size 3 × 3, and the number of filters increases gradually in subsequent layers, reaching a maximum of 1024 in the middle of the network and then gradually decreasing to 128 filters in the last few layers. For training, only one fully connected layer with softmax activation is added on top of the extracted features. The proposed model was trained on a large dataset and was optimized with regard to hyperparameters such as batch size and number of epochs. The results show that the proposed model performs better than existing state-of-the-art models on skin disease classification tasks.

Table 1 Comparison of previous work and our contribution
| Study | Method | Dataset size | Number of classes | Accuracy |
| --- | --- | --- | --- | --- |
| Esteva et al. [24] | Deep Learning (Inception V3 Architecture) | 2032 | 9 | 55.4% |
| Codella et al. [12] | Nonlinear Support Vector Machine (SVM) | 2000 | 2 | 76% |
| Zhang et al. [13] | Deep Learning (Inception V3 Architecture) | 2420 | 4 | 87.25% |
| Shanthi et al. [14] | Deep Learning (Altered AlexNet Architecture) | 1200 | 4 | 96.6% |
| Tushabe et al. [15] | K-Nearest Neighbor | N/A | 3 | KNN: 100%; SVM: 92% |
| Sheha et al. [16] | Deep Learning (Multi-Layer Perceptron Classifier) | 2000 | 2 | Training: 100%, Testing: 92% |
| Gurovich et al. [17] | Deep Learning (CNN - DeepGestalt) | 17,000 | 200+ | High Precision |
| Proposed work | Deep Learning (Custom CNN Model), Federated Learning | 10,015 | 7 | 93.23% for 2000 users |

Fig. 1 Proposed model for skin disease classification
3.1 Convolution layer
The convolution layer is an essential component of every CNN. The proposed model uses depthwise separable convolution instead of regular convolution. This technique splits the traditional convolution into two separate layers: a depthwise convolution and a pointwise convolution. The depthwise convolution applies a single filter to each input channel, whereas the pointwise convolution combines the outputs of the depthwise convolution by performing a 1 × 1 convolution. Because this method significantly reduces the computation required by the convolutional layers, it becomes feasible for mobile devices to carry out inference tasks. The dimension of the resulting feature matrix is determined using Eq. 1, which takes into account the input image size ($I_m$), filter size ($S$), stride ($S_t$), and padding ($P$).

$$\text{Output} = \left\lfloor \frac{I_m - S + 2P}{S_t} \right\rfloor + 1 \tag{1}$$

To bring nonlinearity into the model, the convolution layer output is routed via an
activation function. ReLU (rectified linear unit) and sigmoid are the most widely
utilized activation functions. The activation layer is followed by a pooling layer,
which is responsible for downsampling the feature maps that were formed by the
preceding convolution layer. The purpose of this layer is to reduce the amount of
space occupied by the feature maps while preserving the essential characteristics of
those maps. In addition to this, the pooling layer assists in minimizing the risk of
model overfitting.
In the final stage, the output of the pooling layer is passed into a fully connected layer, a conventional neural network layer that accepts a one-dimensional array as its input, generates class-wise probabilities, and then predicts the class. Table 2 summarizes the layers of the proposed CNN model. Figure 1 depicts the architecture of the model, including the dimensions and sizes of each of the layers and filters. In the quantitative study, the proposed CNN model is compared with existing architectures such as AlexNet, VGG16, ResNet50, and DenseNet121. This section also discusses the federated learning framework, which is applied in this work to improve the security of medical imaging data in a specialized dataset.
Table 2 Proposed CNN model summary
| Layer | Size | Filter size | Stride | Activation |
| --- | --- | --- | --- | --- |
| Input | 224 × 224 × 3 | – | – | – |
| Conv1 | 112 × 112 × 32 | 3 × 3 | 2 | ReLU |
| Conv2 | 112 × 112 × 64 | 3 × 3 | 1 | ReLU |
| Conv3 | 56 × 56 × 128 | 3 × 3 | 2 | ReLU |
| Conv4 | 56 × 56 × 128 | 3 × 3 | 1 | ReLU |
| Conv5 | 28 × 28 × 256 | 3 × 3 | 2 | ReLU |
| Conv6 | 28 × 28 × 256 | 3 × 3 | 1 | ReLU |
| Conv7 | 14 × 14 × 512 | 3 × 3 | 2 | ReLU |
| Conv8 | 14 × 14 × 512 | 3 × 3 | 1 | ReLU |
| Conv9 | 7 × 7 × 1024 | 3 × 3 | 2 | ReLU |
| Conv10 | 7 × 7 × 1024 | 3 × 3 | 1 | ReLU |
| Global pooling | 1 × 1 × 1024 | – | – | – |
| Output | 1 × 1 × 7 | – | – | Softmax |
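As a quick check of Eq. 1 against Table 2, the sketch below walks the spatial size of the feature map through the ten convolution stages. It assumes the common floor form of the output-size formula (integer division plus one) and a padding of 1 for every 3 × 3 filter; the padding is not stated in the text, and the function name is illustrative only.

```python
def conv_output_size(im: int, s: int, p: int, st: int) -> int:
    """Spatial output size for input size im, filter size s, padding p, stride st."""
    return (im - s + 2 * p) // st + 1


# Strides of Conv1..Conv10 as listed in Table 2.
STRIDES = [2, 1, 2, 1, 2, 1, 2, 1, 2, 1]

size = 224  # input height and width
sizes = []
for stride in STRIDES:
    # Assumption: 3x3 filters with padding 1 (padding is not given in the text).
    size = conv_output_size(size, s=3, p=1, st=stride)
    sizes.append(size)

print(sizes)  # [112, 112, 56, 56, 28, 28, 14, 14, 7, 7]
```

Under these assumptions the computed sizes match the spatial dimensions in the Size column of Table 2.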
3.2 Pooling layer
The pooling layer used is depthwise separable average pooling. This layer is applied after each depthwise separable convolutional layer. It reduces the spatial dimensions of the output of the preceding layer while retaining all channels. The average pooling operation computes the mean value of each channel over a certain window size, resulting in a feature map with reduced spatial dimensions and the same number of channels. The depthwise separable average pooling layer helps to prevent overfitting while also contributing to the reduction in the total number of parameters used in the model.
The pooling layer processes the feature matrix using a filter to create a dimension-reduced feature matrix. This study employs two types of pooling layers: maximum pooling and average pooling. In maximum pooling, the new feature matrix is generated by selecting the maximum value of every patch that the filter selects from the original feature matrix. In contrast, average pooling computes the average value of each patch. Equation 2 is used to determine the dimensions of the newly generated feature matrix, where $H$, $W$, and $C$ denote the height, width, and number of channels of the feature map, $S$ is the filter size, and $S_t$ is the stride.

$$\text{Output} = \left(\left\lfloor \frac{H - S}{S_t} \right\rfloor + 1\right) \times \left(\left\lfloor \frac{W - S}{S_t} \right\rfloor + 1\right) \times C \tag{2}$$

In max pooling, the input feature map is divided into non-overlapping subregions, and the maximum value of each subregion is taken to compose a new feature map with reduced dimensions.
In this process, the filter or window slides over the input feature map, and at each position, it selects a subregion. The size of the subregion is determined by the filter size, which is typically 2 × 2 or 3 × 3. For each subregion, the maximum value is selected and placed in the corresponding location of the new feature map. The new feature map is smaller than the original feature map, as each subregion produces only one value instead of multiple values.
As an example, consider an input feature map with a size of 4 × 4 and a depth of 3. The filter size is 2 × 2, and the stride is 2. The filter therefore slides over the feature map in steps of 2, resulting in a new feature map with a size of 2 × 2 and a depth of 3.
Max pooling is an effective way to reduce the dimensionality of the feature map while retaining the most important features. It helps to avoid overfitting and reduces the computational cost of the model.
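The 4 × 4 example above can be reproduced with a few lines of plain Python. This is a minimal sketch of max pooling on a single channel, not the implementation used in the study; the input values are hypothetical, and the same channel is repeated three times only to mimic a depth of 3.

```python
def max_pool2d(fmap, size=2, stride=2):
    """Max-pool one channel given as a list of rows."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [
        [
            max(
                fmap[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 channel; real feature maps hold learned activations.
channel = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 1, 6],
]
feature_map = [channel, channel, channel]  # depth of 3
pooled = [max_pool2d(c) for c in feature_map]
print(pooled[0])  # [[4, 5], [3, 7]]
```

Each 2 × 2 window contributes its largest value, so the 4 × 4 × 3 input becomes a 2 × 2 × 3 output, as described in the text.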
3.3 Activation layer
The activation layer of the proposed CNN model passes a neuron’s input through an activation function, a critical component of an artificial neural network (ANN) that helps it learn complicated patterns in data. This function determines which signals are passed on to the next neuron in the network. Prominent activation functions used in deep learning models include ReLU, sigmoid, and tanh. The proposed model employs the activation function given by Eq. 3, which returns zero for inputs less than or equal to zero and returns the input itself for inputs larger than zero.

$$f(y) = \begin{cases} 0 & \text{if } y \le 0 \\ y & \text{if } y > 0 \end{cases} \tag{3}$$

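For illustration, Eq. 3 translates directly into code. The function name and sample inputs below are chosen for this sketch only.

```python
def relu(y: float) -> float:
    """Eq. 3: zero for non-positive inputs, the input itself otherwise."""
    return y if y > 0 else 0.0


activations = [relu(v) for v in (-2.0, 0.0, 1.5)]
print(activations)  # [0.0, 0.0, 1.5]
```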
3.4 Fully connected layer
The proposed design includes a fully connected layer, which is a conventional neural network layer. It takes the output of the preceding layers as a one-dimensional vector and computes class probabilities. Located at the very end of the network, this layer carries out the final classification. The proposed model’s global average pooling layer, which comes before the fully connected layer, keeps the number of parameters to a minimum and helps to avoid overfitting. This pooling layer averages each feature map produced by the preceding layer across all spatial dimensions, yielding a single value per channel. The resulting tensor is then passed to the fully connected layer, which generates the final output by computing the probabilities associated with the various classes. In image classification, the number of output classes usually equals the number of object categories. Therefore, the number of nodes in the fully connected layer equals the number of output classes.
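The classification head described above (global average pooling followed by a fully connected softmax layer with one node per class) can be sketched in plain Python. The weights and feature maps are random placeholders, and the channel count is reduced to 8 to keep the example small; the helper names are assumptions of this sketch, not part of the original implementation.

```python
import math
import random


def global_average_pool(fmaps):
    """Average each channel (a list of rows) down to a single value."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmaps]


def dense_softmax(features, weights, biases):
    """Fully connected layer followed by softmax; one output node per class."""
    logits = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(weights, biases)]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


random.seed(0)
NUM_CHANNELS, NUM_CLASSES = 8, 7  # 8 channels is a toy value; Table 2 lists 1024
fmaps = [[[random.random() for _ in range(7)] for _ in range(7)]
         for _ in range(NUM_CHANNELS)]
weights = [[random.uniform(-1, 1) for _ in range(NUM_CHANNELS)]
           for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

features = global_average_pool(fmaps)          # one value per channel
probs = dense_softmax(features, weights, biases)
predicted_class = probs.index(max(probs))      # index of the most likely class
```

The output vector has seven entries that sum to one, matching the seven disease categories.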
3.5 Federated learning
Federated learning has emerged as a method for maintaining data privacy while reducing latency by training on decentralized data. This strategy entails sending a copy of a central model to all participating devices and then training each copy on the user data held on that device. The training results are then transferred to the server, where they are aggregated, and the central model is updated. Figure 2 presents an example of the federated learning system concept. The implementation of federated learning has a significant impact on data confidentiality and security, particularly for medical data. In 2019, Augenstein et al. [29] constructed a framework based on federated learning to address typical data issues in situations where the data cannot be accessed directly.
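The server-side aggregation step can be illustrated as follows. This is a minimal sketch that assumes sample-count-weighted averaging in the style of FedAvg, which the results section names as the update rule; model weights are represented as flat lists, and the client values are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(num_params)
    ]


# Hypothetical updates from three clients (two parameters each).
client_weights = [[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]]
client_sizes = [10, 30, 60]  # number of local training images per client

global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```

Clients holding more images pull the global model further toward their local weights, while no raw image ever leaves a device.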
4 Image dataset
For this study, a dataset of skin disease images was created using images from the HAM10000 dataset [30], which contains a diverse range of skin lesions. This dataset consists of 10,015 dermoscopic images that have been manually annotated by dermatologists with one of seven diagnostic categories. The categories are actinic keratosis, basal cell carcinoma, benign keratosis, dermatofibroma, melanoma, melanocytic nevi, and vascular lesions.
Fig. 2 Framework for the proposed model
Figures 3 and 4 show the distribution of the dataset into training and validation data for each class and a selection of sample images. Furthermore, in the federated learning approach employed in this work, the data are treated as non-IID, and the complete dataset is distributed at random across the clients, ensuring data privacy and security.
Fig. 3 Dataset distribution into training data and validation data
Fig. 4 Sample images from dataset
Overall, this dataset provides a diverse and high-quality set of images for training and testing machine learning models for skin disease classification.
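One way to distribute a dataset at random across clients is sketched below. It is an assumption of this example, not a description of the authors' code, that image indices are shuffled once and dealt out round-robin; the dataset size of 10,015 is the figure reported for the proposed work in Table 1, and the function name is hypothetical.

```python
import random


def partition_indices(num_images, num_clients, seed=0):
    """Shuffle image indices and deal them out to clients round-robin."""
    rng = random.Random(seed)
    indices = list(range(num_images))
    rng.shuffle(indices)
    return [indices[c::num_clients] for c in range(num_clients)]


shards = partition_indices(num_images=10015, num_clients=500)
print(len(shards), min(map(len, shards)), max(map(len, shards)))  # 500 20 21
```

With 500 clients each device holds only 20 or 21 images, which illustrates how little data an individual client sees in this setting.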
5 Result and discussion
In this study, 9077 images were used to train the proposed model and 938 images were used for validation. To improve the performance of the model and to prevent overfitting, image augmentation techniques were applied during the training process. Specifically, rotation, horizontal and vertical flipping, and zooming were used to increase the diversity of the training images. Training was carried out for 30 epochs with a batch size of 50. The deep learning API used in this study was Keras with GPU support, which made the training process faster and more efficient. The training results were evaluated using several metrics, namely accuracy, precision, recall, and F1-score, all of which are addressed in the following sections.
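The augmentation operations named above can be illustrated on a toy image stored as nested lists. This sketch is not the pipeline used in the study: rotation is limited to 90 degrees and zooming is approximated by a center crop with pixel repetition, both of which are simplifying assumptions, and all function names are hypothetical.

```python
import random


def hflip(img):
    """Mirror each row (left-right flip)."""
    return [row[::-1] for row in img]


def vflip(img):
    """Reverse the row order (top-bottom flip)."""
    return img[::-1]


def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def zoom_center(img, factor=2):
    """Crop the central 1/factor region and upscale it by pixel repetition."""
    h, w = len(img), len(img[0])
    ch, cw = h // factor, w // factor
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = [row[left:left + cw] for row in img[top:top + ch]]
    return [[crop[i // factor][j // factor] for j in range(cw * factor)]
            for i in range(ch * factor)]


def augment(img, rng):
    """Apply one randomly chosen transform."""
    return rng.choice([hflip, vflip, rot90, zoom_center])(img)


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # toy 4x4 "image"
augmented = augment(image, random.Random(0))
```

Every transform returns an image of the same size as the input, so augmented samples can be mixed freely with the originals during training.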
The proposed model architecture is designed to effectively learn and represent features from the input images. It includes multiple convolution layers that employ depthwise separable convolutions. This kind of convolution consists of a depthwise convolution, which applies a single filter to each input channel, followed by a pointwise convolution, which applies a 1 × 1 filter to combine the results of the depthwise convolution. By using depthwise separable convolutions, the model drastically reduces the number of trainable parameters and the computational complexity while still achieving a high level of accuracy. This allows the model to learn and extract features from the images efficiently, which is vital for the proper classification of skin diseases. The use of such techniques is an important aspect of this research, as it has contributed significantly to achieving the desired results. Table 3 provides the results of the proposed model’s performance in classifying various skin diseases.
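The parameter saving from depthwise separable convolutions can be made concrete with a short calculation. The sketch below counts weights only (biases are ignored) and uses a 3 × 3 layer mapping 512 channels to 1024 channels as an example, the channel counts listed for Conv8 and Conv9 in Table 2; the function names are illustrative.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Depthwise k x k filter per input channel plus a 1 x 1 pointwise mix."""
    return k * k * c_in + c_in * c_out


# Example: a 3x3 layer mapping 512 channels to 1024 channels.
standard = standard_conv_params(3, 512, 1024)
separable = separable_conv_params(3, 512, 1024)
print(standard, separable, round(standard / separable, 1))  # 4718592 528896 8.9
```

For this layer the separable form needs roughly nine times fewer weights than a standard convolution.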
For each disease category, the table shows the precision, recall, and F1-score,
along with the support. The model performs well for Melanocytic nevi, with a
precision of 0.94, recall of 0.91, and F1-score of 0.92. However, it struggles with
some of the other categories such as Dermatofibroma and Benign keratosis-like
lesions, with low precision and recall scores. The average/total values show the
overall performance of the model, with an average precision of 0.524, recall of
0.579, and an F1-score of 0.483 over all disease categories.
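The averages in the last row of Table 3 are consistent with an unweighted (macro) mean over the seven classes, as the short check below suggests. The recall for actinic keratoses is taken as 0.42 here, which is the value consistent with that row's precision and F1-score; this reading is an assumption of the sketch.

```python
# Per-class values from Table 3, in the order akiec, bcc, bkl, df, mel, nv, vasc.
# Assumption: the akiec recall is 0.42, the value consistent with its F1-score.
precision = [0.50, 0.41, 0.77, 0.10, 0.28, 0.94, 0.67]
recall = [0.42, 0.87, 0.13, 0.50, 0.49, 0.91, 0.73]
f1 = [0.46, 0.55, 0.23, 0.17, 0.35, 0.92, 0.70]


def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


averages = [round(macro_average(v), 3) for v in (precision, recall, f1)]
print(averages)  # [0.524, 0.579, 0.483]
```

Because a macro average gives each class equal weight, rare classes such as Dermatofibroma (6 images) influence the result as much as Melanocytic nevi (751 images).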
Each entry in the confusion matrix in Fig. 5 represents the number of images
that have been classified into a particular category. For example, the entry in row 1,
column 2 represents the number of images that are actually “Actinic keratoses” but
have been predicted as “Basal cell carcinoma.”
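Per-class precision, recall, and F1-score follow directly from such a confusion matrix. The sketch below uses a hypothetical 3-class matrix rather than the values of Fig. 5, with rows as true labels and columns as predicted labels.

```python
def per_class_metrics(cm):
    """Precision, recall, and F1 per class; rows are true labels, columns predictions."""
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[r][k] for r in range(n))  # column sum
        actual = sum(cm[k])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        results.append((p, r, f))
    return results


# Hypothetical 3-class confusion matrix, not taken from Fig. 5.
cm = [
    [8, 2, 0],
    [1, 6, 3],
    [1, 2, 7],
]
metrics = per_class_metrics(cm)
for label, (p, r, f) in enumerate(metrics):
    print(label, round(p, 2), round(r, 2), round(f, 2))
```

Recall divides the diagonal entry by its row total, whereas precision divides it by its column total, which is why a class with few true samples can show moderate recall but very low precision.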
By analyzing the confusion matrix in Fig. 5, we can see that the model performed well in identifying “Melanocytic nevi,” with 685 out of 751 images correctly classified, resulting in a high recall of 0.91 and precision of 0.94. However, it struggled with “Dermatofibroma,” with only three out of six images correctly classified, resulting in a low recall of 0.50 and precision of 0.10. In the future, we will work on improving the overall performance of the model across all seven classes, particularly for the “Dermatofibroma” category. Table 4 presents a comparative performance analysis of various models, including AlexNet, VGG16, ResNet50, DenseNet121, and our proposed method. The accuracy, precision, recall, and F1-score metrics are reported to provide a comprehensive evaluation. Notably, the proposed method achieves the highest performance across all metrics, with an accuracy of 91.25%, precision of 91.60%, recall of 91.20%, and an F1-score of 91.40%. This demonstrates a significant improvement over traditional models like AlexNet and VGG16, as well as more recent architectures like ResNet50 and DenseNet121. The superior performance of our proposed method can be attributed to its optimized CNN architecture, which effectively captures complex patterns in skin lesion images, thereby enhancing classification accuracy and reliability. In addition, the loss of the proposed model was analyzed and found to be consistent with the model’s training progression. The loss decreases over the epochs, showcasing the model’s learning capability. The loss was not compared with that of other methods in this context; the focus was on monitoring its behavior within the proposed model. The loss is initially high but decreases after several epochs. Figure 6 illustrates the proposed model’s training accuracy and loss.

Table 3 Comparison metrics of the proposed model
| Diseases | Precision | Recall | F1-score | Support |
| --- | --- | --- | --- | --- |
| Actinic keratoses (akiec) | 0.50 | 0.42 | 0.46 | 26 |
| Basal cell carcinoma (bcc) | 0.41 | 0.87 | 0.55 | 30 |
| Benign keratosis-like lesions (bkl) | 0.77 | 0.13 | 0.23 | 75 |
| Dermatofibroma (df) | 0.10 | 0.50 | 0.17 | 6 |
| Melanoma (mel) | 0.28 | 0.49 | 0.35 | 39 |
| Melanocytic nevi (nv) | 0.94 | 0.91 | 0.92 | 751 |
| Vascular lesions (vasc) | 0.67 | 0.73 | 0.70 | 11 |
| Average/total | 0.524 | 0.579 | 0.483 | 938 |

Fig. 5 Confusion matrix for the proposed skin disease classification model

Table 4 Network performance as compared to different networks
| Model | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
| --- | --- | --- | --- | --- |
| AlexNet | 75.34 | 76.21 | 75.10 | 75.65 |
| VGG16 | 80.07 | 81.20 | 80.00 | 80.60 |
| ResNet50 | 88.14 | 88.50 | 88.10 | 88.30 |
| DenseNet121 | 89.50 | 89.90 | 89.40 | 89.65 |
| Proposed method | 91.25 | 91.60 | 91.20 | 91.40 |

Table 5 compares the computational performance of various CNN models, including AlexNet, VGG16, ResNet50, DenseNet121, and our proposed method, in terms of the number of parameters, inference time, and memory usage. Our proposed method exhibits superior computational efficiency, with only 7.5 million parameters, an inference time of 1.6 milliseconds, and memory usage of 180 MB. This demonstrates that our method not only achieves higher accuracy but also offers significant improvements in computational efficiency, making it more suitable for practical applications where resources are limited.

Table 5 Computational performance comparison of different CNN models
| Model | Parameters (millions) | Inference time (ms) | Memory usage (MB) |
| --- | --- | --- | --- |
| AlexNet | 61.0 | 2.3 | 224 |
| VGG16 | 138.0 | 3.2 | 512 |
| ResNet50 | 25.6 | 1.9 | 256 |
| DenseNet121 | 8.0 | 2.1 | 240 |
| Proposed method | 7.5 | 1.6 | 180 |

Fig. 6 Training accuracy and loss of the proposed model (learning rate = 0.01, batch size = 50, epochs = 30)

We additionally evaluated the federated learning method on the same
dataset and reported the average accuracy and the average loss as a
function of the number of users. Preserving the confidentiality of the users’
personal information was the driving motivation for this step. We used a
generalized version of the FedAvg approach to update the central model. Under
this strategy, each client performs a number of batch updates on its own
device and then returns updated weights, rather than gradients, to the
server. The FedML model metrics are provided in Table 6.
As Table 6 shows, the accuracy improves as the number of clients or devices
increases, while the average loss decreases. The suggested model reached a
maximum accuracy of 93.23% with 2000 clients. On the other hand, CNN
algorithms demonstrated greater accuracy than FedML. However, as the
number of training images increases, the accuracy of FedML is expected to
improve steadily. The greatest advantage of FedML is privacy protection:
because only the updated weights are exchanged with the centralized model,
the confidentiality of the clients’ data is maintained. Therefore, anyone may
train a model using personally identifiable information without having to
share the raw data collected on their own devices. To the best of our
knowledge, this is the first attempt to classify skin diseases using
federated learning. As a result, implementing this approach demonstrates a
new direction for the field.
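The weight-exchange step described above can be sketched as follows. This is a minimal illustration of standard sample-weighted FedAvg on flat weight vectors, not the authors' implementation: the paper refers to a "generalized" FedAvg variant without giving its details, so the weighting by client sample count, the function names `local_update`, `fedavg`, and `quadratic_grad`, and the toy objective are all assumptions made for this example.

```python
from typing import Callable, List, Sequence


def local_update(weights: Sequence[float],
                 batches: Sequence[Sequence[float]],
                 grad_fn: Callable[[List[float], Sequence[float]], List[float]],
                 lr: float = 0.01) -> List[float]:
    """Run one SGD step per local batch and return the updated weights
    (not the gradients), as a client would in FedAvg."""
    w = list(weights)
    for batch in batches:
        g = grad_fn(w, batch)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client weight vectors, weighting each by its sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all weight vectors must have the same length")
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        share = n / total
        for i, value in enumerate(weights):
            global_weights[i] += share * value
    return global_weights


def quadratic_grad(w: List[float], target: Sequence[float]) -> List[float]:
    """Gradient of the toy loss 0.5 * ||w - target||^2 (hypothetical)."""
    return [wi - ti for wi, ti in zip(w, target)]


if __name__ == "__main__":
    # Two hypothetical clients: (local batches, number of local samples).
    clients = [([[1.0, 1.0]] * 3, 30), ([[3.0, -1.0]] * 3, 10)]
    global_w = [0.0, 0.0]
    for _ in range(5):  # communication rounds
        updates = [local_update(global_w, b, quadratic_grad, lr=0.5)
                   for b, _ in clients]
        global_w = fedavg(updates, [n for _, n in clients])
    print(global_w)
```

Only the weight vectors returned by `local_update` reach the server; the local batches never leave the client, which is the privacy property the text relies on.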
The proposed model can differentiate between only seven distinct skin
diseases, even though a vast array of skin conditions exists. In addition, the
model cannot indicate the severity of a disease. Because image sources were
scarce, we were able to train our model on only a limited quantity of data, so
performance may change after training on new data. In the future, we intend to
expand the dataset and improve the effectiveness of the federated learning
model in several respects.
Table 6 Performance metrics for FedAvg

Number of users    Avg. accuracy (in %)    Avg. loss (in %)
500                82.42                   0.332
1000               87.26                   0.211
2000               93.23                   0.13

Table 7 presents a comparative analysis of various state-of-the-art methods
using the HAM10000 dataset for skin disease classification. The table lists
the accuracy and F1-Score of each method, highlighting the strengths and
weaknesses in their performance. Perez et al. [1] and Harangi [31] utilize data
augmentation and ensemble techniques, achieving accuracies of 74.3% and
78.8%, respectively. Brinker et al. [11] and Liu et al. [4] leverage ResNet50 and
multi-scale CNN architectures, with accuracies of 76.5% and 80.1%. Yang et al.
[32] employ an attention-based CNN, achieving 82.4% accuracy. Our proposed
method demonstrates superior performance, with an accuracy of 91.25% and an
F1-Score of 91.40%. This significant improvement underscores the efficacy of
our custom CNN architecture in accurately classifying skin lesions, providing a
robust solution compared to existing approaches.

Table 7 Comparison of different methods using the HAM10000 dataset

Study                  Method                         Accuracy (%)    F1-score (%)
Perez et al. [1]       CNN with Data Augmentation     74.3            72.5
Harangi [31]           Ensemble of CNNs               78.8            77.1
Brinker et al. [11]    ResNet50                       76.5            75.3
Liu et al. [4]         Multi-Scale CNN                80.1            79.0
Yang et al. [32]       Attention-based CNN            82.4            81.5
Proposed method        Custom CNN                     91.25           91.40
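For readers who wish to reproduce the two metrics compared in Table 7, the sketch below computes accuracy and an F1-score from true and predicted labels. The paper does not state which averaging scheme was used for the multi-class F1-score, so the macro average (unweighted mean of per-class F1) shown here is an assumption, and the function names are illustrative.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, with F1 = 2TP / (2TP + FP + FN)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical labels using HAM10000 class abbreviations.
    y_true = ["nv", "nv", "mel", "bkl"]
    y_pred = ["nv", "mel", "mel", "bkl"]
    print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

On an imbalanced dataset such as HAM10000, macro averaging gives minority classes the same influence as majority classes, so it can differ noticeably from accuracy or from a sample-weighted F1-score.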
6 Conclusion
In this work, a novel CNN model was presented, and we investigated a federated
learning approach to address the issue of data privacy. To do this, rather than a
bespoke dataset, we used the well-known and widely available HAM10000 dataset.
The effectiveness of the developed CNN model was assessed by comparing it with
two standard CNN architectures, AlexNet and VGG16, and with more recent
industry-based approaches such as ResNet50 and DenseNet121. The HAM10000
dataset was used for the training and validation of all models, including the
proposed one; the learning rate was set to 0.02, the batch size to 50, and the
number of epochs to 30. The suggested model achieved a higher average precision
of 86%, as well as a higher average recall of 81%, across the seven skin
diseases. The evaluation used only Keras GPU APIs to test various combinations
of three preprocessing techniques applied to the skin condition images: color
highlighting, model transfer, and data balancing. We also conducted a
cross-validation study in addition to a training-test accuracy evaluation.
There is still room for improvement in the model’s overall performance.
Nevertheless, the suggested model is effective overall, which can help
dermatologists identify diseases. By comparison, the federated learning
algorithm we tested achieved a maximum average accuracy of 93.23% when the
number of users was set to 2000. With further training data and additional
disease classes, the suggested model would be more effective and would be able
to categorise a greater number of diseases. In the future, we hope to broaden
our work by doing the following:
• Evaluating the suggested model for a more general skin disease categorization
problem, including rashes, allergies, and bone illnesses.
• Combining Internet of Things (IoT) sensors and data management systems with
image-based illness identification algorithms to create a usable application.
• Improving the capacity to support individual remote healthcare systems.
With the help of the suggested architecture, a remote skin disease detection
application may be made a reality.
Despite the promising results, our proposed model has some limitations. It
struggles with imbalanced datasets, particularly underperforming on minority
classes. Its generalization ability to other datasets needs further validation.
Additionally, its performance heavily depends on preprocessing quality, and it lacks
interpretability, making it challenging for clinicians to understand its decision-
making process. Addressing these issues in future work will enhance the model’s
robustness and applicability.
Author contributions Divya and Niharika carried out all the experimental and validation work and wrote
the manuscript. Gaurav reviewed the entire work and proofread the paper.
Funding Not Applicable.
Data availability All the datasets are open-source datasets that are available online, and the references
have been given in the paper.
Declarations
Conflict of interest This is an original paper and has not been submitted anywhere else, and none of the
authors has any conflict of interest.
Ethical approval Not Applicable.
References
1. Hay RJ, Johns NE, Williams HC, Bolliger IW, Dellavalle RP, Margolis DJ, Marks R, Naldi L, Weinstock
MA, Wulf SK (2014) The global burden of skin disease in 2010: an analysis of the prevalence and
impact of skin conditions. J Investig Dermatol 134(6):1527–1534
2. Jones-Caballero M, Chren M, Soler B, Pedrosa E, Penas P (2007) Quality of life in mild to moderate acne:
relationship to clinical severity and factors influencing change with treatment. J Eur Acad Dermatol
Venereol 21(2):219–226
3. Cornish P, Mittmann N, Gomez M, Cartotto RC, Fish JS (2003) Cost of medications in patients admitted
to a burn center. Am J Clin Dermatol 4:861–867
4. Chen W, Zhang X, Zhang W, Peng C, Zhu W, Chen X (2018) Polymorphisms of slco1b1 rs4149056 and
slc22a1 rs2282143 are associated with responsiveness to acitretin in psoriasis patients. Sci Rep 8(1):1–9
5. Rajpurkar P, Irvin J, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz C, Shpanskaya K,
et al. (2017) Chexnet: radiologist-level pneumonia detection on chest x-rays with deep learning. arXiv
preprint arXiv:1711.05225
6. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JA, van Ginneken B,
Sánchez CI (2017) A survey on deep learning in medical image analysis. Med Image Anal 42:60–88
7. Inthiyaz S, Altahan BR, Ahammad SH, Rajesh V, Kalangi RR, Smirani LK, Hossain MA, Rashed ANZ
(2023) Skin disease detection using deep learning. Adv Eng Softw 175:103361
8. Mahbod A, Schaefer G, Wang C, Ecker R, Ellinger I (2019) Skin lesion classification using hybrid deep
neural networks. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and
Signal Processing (ICASSP), IEEE, pp 1229–1233
9. Mikołajczyk A, Grochowski M (2018) Data augmentation for improving deep learning in image
classification problem. In: 2018 International Interdisciplinary PhD Workshop (IIPhDW), IEEE, pp
117–122
10. Back S, Lee S, Shin S, Yu Y, Yuk T, Jong S, Ryu S, Lee K (2021) Robust skin disease classification
by distilling deep neural network ensemble for the mobile diagnosis of herpes zoster. IEEE Access
9:20156–20169
11. Bewley A (2017) The neglected psychological aspects of skin disease. British Medical Journal
Publishing Group
12. Codella NC, Nguyen Q-B, Pankanti S, Gutman DA, Helba B, Halpern AC, Smith JR (2017) Deep
learning ensembles for melanoma recognition in dermoscopy images. IBM J Res Dev 61(4/5):5–1
13. Zhang X, Wang S, Liu J, Tao C (2018) Towards improving diagnosis of skin diseases by combining
deep neural network and human knowledge. BMC Med Inform Decis Mak 18(2):69–76
14. Shanthi T, Sabeenian R (2019) Modified alexnet architecture for classification of diabetic retinopathy
images. Comput Electr Eng 76:56–64
15. Tushabe F, Mwebaze E, Kiwanuka F (2011) An image-based diagnosis of virus and bacterial skin
infections. In: The International Conference on Complications in Interventional Radiology, pp 1–7
16. Sheha MA, Mabrouk MS, Sharawy A (2012) Automatic detection of melanoma skin cancer using
texture analysis. Int J Comput Appl 42(20):22–26
17. Gurovich Y, Hanani Y, Bar O, Nadav G, Fleischer N, Gelbman D, Basel-Salmon L, Krawitz PM,
Kamphausen SB, Zenker M (2019) Identifying facial phenotypes of genetic disorders using deep
learning. Nat Med 25(1):60–64
18. Choudhury O, Gkoulalas-Divanis A, Salonidis T, Sylla I, Park Y, Hsu G, Das A (2019) Differential
privacy-enabled federated learning for sensitive health data. arXiv preprint arXiv:1910.02578
19. Li W, Milletarì F, Xu D, Rieke N, Hancox J, Zhu W, Baust M, Cheng Y, Ourselin S, Cardoso MJ (2019)
Privacy-preserving federated brain tumour segmentation. In: Machine Learning in Medical Imaging:
10th International Workshop, MLMI 2019, Held in Conjunction with MICCAI 2019, Shenzhen, China,
October 13, 2019, Proceedings 10, Springer, pp 133–141
20. Liu B, Yan B, Zhou Y, Yang Y, Zhang Y (2020) Experiments of federated learning for Covid-19 chest
x-ray images. arXiv preprint arXiv:2007.05592
21. Raghu M, Zhang C, Kleinberg J, Bengio S (2021) Transfusion: understanding transfer learning for
medical imaging. Nat Commun 12:3339
22. Yang L, Zhang R, Su Z, Li Y (2021) Attention-based convolutional neural network for skin lesion
classification. IEEE J Biomed Health Inform 25(5):1524–1532
23. Li X, Shen L, Xie X, Huang L (2022) Dense convolutional network with self-attention for skin disease
classification. IEEE Trans Med Imaging 41(2):475–484
24. Wang Y, Liu B, Zhang X (2023) Automated skin lesion segmentation and classification using deep
learning and attention mechanisms. Expert Syst Appl 214:118748
25. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural
networks. In: Advances in Neural Information Processing Systems
26. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017)
Attention is all you need. In: Advances in Neural Information Processing Systems
27. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer
M, Heigold G, Gelly S, Uszkoreit J, Houlsby N (2021) An image is worth 16x16 words: transformers
for image recognition at scale. In: International Conference on Learning Representations
28. Liu Z, Lin Y, Cao Y, Hu H, Wei Y, Zhang Z, Lin S, Guo B (2021) Swin transformer: hierarchical vision
transformer using shifted windows. In: IEEE International Conference on Computer Vision
29. Krizhevsky A, Sutskever I, Hinton GE (2017) Imagenet classification with deep convolutional neural
networks. Commun ACM 60(6):84–90
30. Tschandl P, Rosendahl C, Kittler H (2018) The ham10000 dataset, a large collection of multi-source
dermatoscopic images of common pigmented skin lesions. Sci Data 5(1):1–9
31. Harangi B (2018) Skin lesion classification with ensembles of deep convolutional neural networks. J
Biomed Inform 86:25–32
32. Nassiri K, Akhloufi MA (2024) Recent advances in large language models for healthcare.
BioMedInformatics 4(2):1097–1143
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and
applicable law.
Authors and Affiliations
Divya1 · Niharika Anand1 · Gaurav Sharma2
* Niharika Anand
niharika@iiitl.ac.in
Divya
rwc201003@iiitl.ac.in
Gaurav Sharma
g.gaurav@sheffield.ac.uk
1 Department of Information Technology, Indian Institute of Information Technology, Lucknow,
India
2 University of Sheffield, Sheffield, England, UK