Mathematics 11 04566
Article
A Deep Multi-Task Learning Approach for Bioelectrical
Signal Analysis
Jishu K. Medhi 1 , Pusheng Ren 1 , Mengsha Hu 2 and Xuhui Chen 1, *
1 College of Aeronautics and Engineering, Kent State University, Kent, OH 44240, USA;
jmedhi@kent.edu (J.K.M.); pren1@kent.edu (P.R.)
2 Department of Computer Science, Kent State University, Kent, OH 44240, USA; mhu8@kent.edu
* Correspondence: xchen58@kent.edu
Abstract: Deep learning is a promising technique for bioelectrical signal analysis, as it can automat-
ically discover hidden features from raw data without substantial domain knowledge. However,
training a deep neural network requires a vast amount of labeled samples. Additionally, a well-
trained model may be sensitive to the study object, and its performance may deteriorate sharply when
transferred to other study objects. We propose a deep multi-task learning approach for bioelectrical
signal analysis to address these issues. Explicitly, we define two distinct scenarios, the consistent
source-target scenario and the inconsistent source-target scenario based on the motivation and purpose of
the tasks. For each scenario, we present methods to decompose the original task and dataset into
multiple subtasks and sub-datasets. Correspondingly, we design the generic deep parameter-sharing
neural networks to solve the multi-task learning problem and illustrate the details of implementation
with one-dimensional convolutional neural networks (1D CNN), vanilla recurrent neural networks
(RNN), recurrent neural networks with long short-term memory units (LSTM), and recurrent neural
networks with gated recurrent units (GRU). In these two scenarios, we conducted extensive exper-
iments on four electrocardiogram (ECG) databases. The results demonstrate the benefits of our
approach, showing that our proposed method can improve the accuracy of ECG data analysis (up to
5.2%) in the MIT-BIH arrhythmia database.
can significantly reduce the heavy human resource consumption of traditional
data analysis.
One of the complexities of bioelectrical signals is their nonlinear nature, which makes
it difficult to process them accurately using traditional statistical tools. Currently, only fully
trained and experienced physicians can perform such tasks and, even then, the clinician’s
judgment is still accompanied by a certain amount of error. Fortunately, deep learning,
an advanced method among the aforementioned systems, has strong capabilities in nonlin-
ear functional learning and has found extensive applications in multi-dimensional signal
processing problems [8]. Therefore, combining deep learning with bioelectrical signal
analysis is a promising direction to explore. Previous works [9,10] have shown that
the method can achieve state-of-the-art performance. Lipton et al. [9] trained a long short-term
memory (LSTM) recurrent neural network to diagnose heart diseases automatically and
obtained 86.6% classification accuracy. Rajpurkar et al. [10] classified heartbeat arrhythmias
with a 34-layer convolutional neural network, which exceeded the average
cardiologist’s performance in both sensitivity and precision. These results show that deep
learning can maintain the accuracy of signal processing while greatly reducing the manual
effort required, which encourages further exploration in this direction.
One of the barriers to the development of deep learning techniques is the limited
amount of labeled data available for training, which causes the data sparsity problem
and becomes a major bottleneck for applying deep learning to bioelectrical signal
analysis. Normally, training a deep neural network requires millions of labeled data
points [11–13]. In addition, data labeling usually needs to be performed manually, which
makes the labeling process extremely time-consuming.
Moreover, a well-trained deep model may be sensitive to the study objects. When
transferred to other study objects, the performance of the predictive model may deteriorate
sharply. The main reason behind this phenomenon is that bioelectrical signals can differ
substantially among patients [7,14]. For instance, seizure morphology on EEG signals can
vary among different patients due to the diverse neuroanatomical and pathophysiological
causes of epileptic disease; the same arrhythmia may have divergent morphology on ECG
signals for different patients. Two different arrhythmias can also produce nearly identical
effects on standard ECG signals because of differences in electrode positions [15].
Therefore, the scope and complexity of the data used to train the deep network need to be
further enhanced, and the structure of the system itself needs to be optimized accordingly
in order to improve the transferability, generalizability, and robustness of the system.
Multi-task learning offers an attractive solution, via reusing knowledge learned by
other similar tasks, to address the issues of data sparsity and object sensitivity in the target
task. Multi-task learning trains several similar tasks jointly to capture the commonalities
among them. The individual task utilizes the shared knowledge to boost its own perfor-
mance. It has achieved great success in many fields, including speech recognition [16,17],
computer vision [18,19], text mining [20,21], and drug discovery [22,23].
A preliminary version of this work has been reported [24], which focuses on utilizing
deep multi-task learning to increase the performance of ECG arrhythmias detection and
classification. We defined two scenarios to apply multi-task learning, the consistent source-
target scenario and the inconsistent source-target scenario. We initially proposed one deep
multi-task learning structure for both scenarios and person-wise study objects in source
and target domains. The experiments are conducted on one dataset (MIT-BIH Arrhythmia
Database), which is not thorough enough to evaluate the proposed scheme’s efficacy
and transferability.
This paper extends the deep multi-task learning scheme to broader bioelectrical signal
analysis. Specifically, based on the two initially proposed scenarios, the study
objects are no longer considered person-wise but are expanded to groups of patients’
data from four databases. Moreover, we propose two distinct generic parameter-sharing
neural networks for these two scenarios to improve the transferability. The experiments
on four databases demonstrate that our proposed system can increase the performance
Mathematics 2023, 11, 4566 3 of 17
of bioelectrical signal analysis. Thereby, the predictive model learned from the multi-task
learning approach has a better transferable performance.
We summarize the major contributions as follows:
• Based on the purpose and motivation behind data analysis tasks, we define two
different scenarios for utilizing multi-task learning in analyzing bioelectrical signals,
the consistent source-target scenario and the inconsistent source-target scenario, which are
shown to enhance the transferability of the proposed schemes.
• For each scenario, we propose a method to decompose the analysis task into several
subtasks and convert the original dataset to adapt to the multi-task learning. We also
design the generic parameter-sharing neural networks for each scenario and illustrate
the details of implementing different basic neural layers, like convolutional layers and
recurrent layers.
• We conduct extensive experiments on four electrocardiogram databases. The experimental
results demonstrate that the proposed systems can improve the analysis
performance and enable the predictive models to be transferable.
The rest of this paper is organized as follows: Section 2 preliminarily introduces deep
learning and multi-task learning. In Section 3, we propose two different scenarios—the
consistent source-target scenario and the inconsistent source-target scenario—to apply multi-
task learning in the bioelectrical signal analysis. Sections 4 and 5 illustrate the details of
applying deep multi-task learning in those two scenarios, respectively. Section 6 describes
the experiments on arrhythmia classification using ECG signals, and its results are presented
in Section 7. We discuss the findings of our work in Section 8 and, finally, summarize the
paper in Section 9.
2. Preliminaries
2.1. Deep Learning
Deep learning serves as an incredibly powerful tool for various machine learning
applications [25,26]. It can easily generate accurate predictive models through the uti-
lization of deep neural networks (DNNs). Typically, DNNs consist of multiple layers of
nonlinear processing nodes, allowing them to learn feature representations through each
layer. The earliest framework of deep neural networks is based on multi-layered artificial
neural networks (ANNs) inspired by biological neurons in the brain [27]. ANNs are still
widely used for approximating mathematical functions and for regression analysis, making
them very useful for tasks like predicting the behavior of financial markets, modeling
physical systems, and predicting real estate prices [28,29]. Nonetheless, the broad impact
of deep learning became apparent in 2006 [30–32]. Since then, deep learning has been
successfully applied to a wide range of fields, including computer vision, natural language
processing, and bioinformatics [7,24,33].
In this paper, we focus on bioelectrical signal analysis, in which the available data
constitute a time series. Traditionally, DNNs lacked the ability to model the
dynamic temporal behavior of such data. Thus, researchers have proposed many DNN
variants to capture the temporal features of the data, like one-dimensional convolutional neural
networks (1D CNNs) [34], vanilla recurrent neural networks (RNNs) [35], convolutional
long short-term memory units (ConvLSTMs) [36], recurrent neural networks with long
short-term memory units (LSTMs) [37], and recurrent neural networks with gated recurrent
units (GRUs) [38]. In 2019, Hasan et al. [39] trained a 1D convolutional neural network to
diagnose cardiovascular disease automatically. Hannun et al. [40] proposed a bidirectional
LSTM convolutional neural network to detect and classify the heartbeat arrhythmias and
achieved an accuracy of 96.59%, a sensitivity of 99.93% and a specificity of 97.03%. More
recently, Hu et al. [41] proposed a novel CNN-transformer based deep learning model to
classify heartbeats from continuous ECG signals for arrhythmia detection and achieved an
overall accuracy of 99.12%.
Specifically, for a bioelectrical signal analysis task, we first expand it into several similar
and related subtasks, which is a task space partitioning process. Then, based on these
subtasks, we convert the original dataset to sub-datasets to adapt to the multiple subtasks.
Finally, we design a generic parameter-sharing neural network to train these subtasks
simultaneously in a deep multi-task setting.
The primary challenge of designing a deep multi-task learning scheme for bioelectrical
signal analysis is how to transform one single analysis task into several similar and related
subtasks. We call the patient or group of patients whose data are used to obtain
the predictive model the source domain or source study object. Correspondingly, we
call the patient or group of patients to whose data the predictive model will be applied
the target domain or target study object. In this paper, we propose two
distinct methods to conduct task space partitioning for two scenarios, respectively, where
the scenarios are distinguished based on the consistency of the study objects. Specifically,
if the source and target study objects are the same patient/group of patients, we name this
scenario the consistent source-target scenario, where the goal of utilizing multi-task learning
is to increase the analysis precision. Otherwise, if the source and target study objects
are different patients/groups of patients, we name this scenario the inconsistent source-target
scenario, where the objective of utilizing multi-task learning is to learn a transferable
predictive model for bioelectrical signal analysis. Since the sub-dataset building process
and the parameter-sharing neural network design process heavily depend on the task
partitioning, we introduce them separately for the consistent and inconsistent source-target
scenarios in the following.
Intuitively, each subtask only focuses on solving a simple subproblem in the entire
complicated task space. Meanwhile, as a system, subtasks collaboratively assist each other
to improve their performance in their own problems. Therefore, the multi-task learning
system has a better performance than a single task system.
According to the task decomposition in the consistent source-target scenario, we rebuild
the original datasets. For the original dataset $D$ in a multi-class classification setting, a data
sample $d_i$ is $\{x_{i_s:i_e}, y_i\}$, where $x_{i_s:i_e} = \{x_{i_s} \oplus x_{i_s+1} \oplus \cdots \oplus x_{i_e}\}$ is the bioelectrical signal sequence,
$y_i \in \{1, 2, \cdots, K\}$ is the class label, and $K$ is the number of categories. We rebuild the
multi-class dataset into $K$ binary classification datasets $D^*$. A sample $d_i^*$ is $\{x_{i_s:i_e}, y_i^1, y_i^2, \cdots, y_i^K\}$, where
$y_i^m \in \{0, 1\}$ is the label for subtask $m$.
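The dataset rebuilding step can be illustrated with a minimal sketch. It assumes a one-vs-rest reading of the decomposition, i.e., $y_i^m = 1$ exactly when $y_i = m$; the helper name `to_binary_subtask_labels` and the toy samples are hypothetical and only serve the illustration.

```python
def to_binary_subtask_labels(labels, num_classes):
    """Turn class labels in {1, ..., K} into K binary subtask labels per sample."""
    rebuilt = []
    for y in labels:
        if not 1 <= y <= num_classes:
            raise ValueError(f"label {y} is outside 1..{num_classes}")
        # Assumption: subtask m asks "does this sample belong to class m?"
        rebuilt.append([1 if y == m else 0 for m in range(1, num_classes + 1)])
    return rebuilt


# Toy dataset (made-up values): each sample is (signal segment, class label).
dataset = [([0.1, 0.4, 0.2], 1), ([0.3, 0.9, 0.5], 3), ([0.0, 0.2, 0.1], 2)]
signals = [x for x, _ in dataset]
binary_labels = to_binary_subtask_labels([y for _, y in dataset], num_classes=3)

# Rebuilt dataset D*: every sample keeps its signal and carries K binary labels.
rebuilt_dataset = list(zip(signals, binary_labels))
```

Each rebuilt sample keeps the same signal sequence, so all $K$ subtasks can be trained on a single pass over the data.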
Figure 3. Model architecture of deep parameter-sharing neural networks in the consistent source-
target scenario. Bioelectrical signals pass through the shared neural layers and then through
output layers 1 to K, which produce the outputs of tasks 1 to K.
In particular, the output of the shared neural layers is the hidden feature representation:
$$h_s = \mathrm{SharedLayers}(x_{i:i+j}). \tag{1}$$
The output of subtask $i$ is:
$$\tilde{y}_i = \mathrm{sigmoid}(w^T h_s + b). \tag{2}$$
Since we decomposed the multi-class classification task into several binary classification [...]
where $n$ is the number of sequence data.
Thus, combining the loss functions of subtasks, the overall loss function is derived for
the whole network:
$$L = \sum_{i=1}^{K} L_i. \tag{4}$$
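As a rough illustration of how the overall loss in Equation (4) is assembled, the sketch below sums one loss per binary subtask. The per-task loss is assumed here to be the mean binary cross-entropy over the $n$ sequences, which is a common choice for sigmoid outputs and not necessarily the exact form used in this work; the function names and toy numbers are hypothetical.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy over n samples (assumed per-task loss L_i)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def overall_loss(labels_per_task, preds_per_task):
    """Overall loss L = sum of the K per-task losses, as in Equation (4)."""
    return sum(
        binary_cross_entropy(y, p) for y, p in zip(labels_per_task, preds_per_task)
    )


# Toy example with K = 3 subtasks and n = 3 sequences (made-up predictions).
labels_per_task = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
preds_per_task = [[0.9, 0.2, 0.1], [0.1, 0.7, 0.3], [0.2, 0.1, 0.8]]
task_losses = [
    binary_cross_entropy(y, p) for y, p in zip(labels_per_task, preds_per_task)
]
total_loss = overall_loss(labels_per_task, preds_per_task)
```

Because the subtasks share layers, minimizing the summed loss updates the shared parameters with gradients from all $K$ subtasks at once.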
Usually, a bioelectrical signal is the time series data of electrical measurement. In deep
learning, researchers commonly utilize convolutional neural networks and recurrent neural
networks to capture the features of time sequence data. Accordingly, we design
parameter-sharing convolutional neural networks and parameter-sharing recurrent neural networks
for the consistent source-target scenario, respectively.
4.2.1. Parameter-Sharing Convolutional Neural Networks
We utilize 1D convolutional neural networks to capture the temporal features from
bioelectrical signals. Specifically, as shown in Figure 4, the signal goes through a shared 1D
convolutional neural layer and a shared 1D max-pooling layer. Then, a shared flatten
layer converts the extracted feature tensor to a feature vector h. Task i attaches
its private fully connected layer to the flatten layer to extract the features it is interested in.
After this, task i utilizes another fully connected layer with sigmoid activation σ(·) to
determine whether the signal sequence belongs to its type.
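The forward pass described above can be sketched in plain Python. This is a minimal illustration under several assumptions that are not stated in the text: ReLU activations after the convolution and the private layer, random untrained weights, and hypothetical layer sizes (2 kernels of width 3, pool size 2, 4 private units, K = 3 tasks).

```python
import math
import random


def conv1d(signal, kernels, bias):
    """Valid 1D convolution followed by ReLU; returns one feature map per kernel."""
    width = len(kernels[0])
    maps = []
    for kernel, b in zip(kernels, bias):
        row = [
            max(0.0, sum(kernel[j] * signal[t + j] for j in range(width)) + b)
            for t in range(len(signal) - width + 1)
        ]
        maps.append(row)
    return maps


def max_pool1d(maps, size):
    """Non-overlapping 1D max-pooling over each feature map."""
    return [
        [max(row[t:t + size]) for t in range(0, len(row) - size + 1, size)]
        for row in maps
    ]


def flatten(maps):
    return [value for row in maps for value in row]


def dense(vec, weights, bias):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
K, PRIVATE_UNITS = 3, 4                       # hypothetical sizes
signal = [math.sin(0.4 * t) for t in range(16)]  # stand-in for a signal segment

# Shared layers: 1D convolution -> 1D max-pooling -> flatten, giving vector h.
shared_kernels = rand_matrix(rng, 2, 3)
h = flatten(max_pool1d(conv1d(signal, shared_kernels, [0.0, 0.0]), size=2))

# Task-specific layers: private fully connected layer, then a sigmoid output.
outputs = []
for _ in range(K):
    w1 = rand_matrix(rng, PRIVATE_UNITS, len(h))
    w2 = rand_matrix(rng, 1, PRIVATE_UNITS)
    private = [max(0.0, v) for v in dense(h, w1, [0.0] * PRIVATE_UNITS)]
    outputs.append(sigmoid(dense(private, w2, [0.0])[0]))
```

Each entry of `outputs` is the score of one binary subtask; in practice the layers would be built and trained with a deep learning framework rather than written by hand.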
Figure 4. Model architecture of deep parameter-sharing convolutional neural networks in the
consistent source-target scenario.
4.2.2. Parameter-Sharing Recurrent Neural Networks
We also implement RNNs to extract features. As shown in Figure 5, for task i,
the parameter-sharing neural networks extract a hidden representation vector h from the
bioelectrical signal via a shared recurrent neural layer. As in the parameter-sharing convolutional
neural networks, task i appends two fully-connected layers to the output of the recurrent
neural layer.
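A minimal sketch of the recurrent variant follows. It assumes a vanilla RNN with tanh activation as the shared layer, takes the final hidden state as the representation h, and uses ReLU in the first task-specific layer; these choices, the layer sizes, and the random untrained weights are illustrative assumptions rather than the configuration used in the experiments.

```python
import math
import random


def rnn_final_state(signal, w_x, w_h, b):
    """Run a vanilla RNN over a scalar sequence and return the last hidden state."""
    hidden = [0.0] * len(b)
    for x_t in signal:
        # h_t = tanh(w_x * x_t + W_h h_{t-1} + b), computed unit by unit
        hidden = [
            math.tanh(
                w_x[u] * x_t
                + sum(w_h[u][v] * hidden[v] for v in range(len(hidden)))
                + b[u]
            )
            for u in range(len(b))
        ]
    return hidden


def dense(vec, weights, bias):
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
UNITS, K, PRIVATE_UNITS = 5, 3, 4             # hypothetical sizes
signal = [math.sin(0.4 * t) for t in range(16)]  # stand-in for a signal segment

# Shared recurrent layer produces the hidden representation h.
w_x = [rng.uniform(-0.5, 0.5) for _ in range(UNITS)]
w_h = rand_matrix(rng, UNITS, UNITS)
h = rnn_final_state(signal, w_x, w_h, [0.0] * UNITS)

# Each task appends two fully connected layers; the second has a sigmoid output.
outputs = []
for _ in range(K):
    w1 = rand_matrix(rng, PRIVATE_UNITS, UNITS)
    w2 = rand_matrix(rng, 1, PRIVATE_UNITS)
    private = [max(0.0, v) for v in dense(h, w1, [0.0] * PRIVATE_UNITS)]
    outputs.append(sigmoid(dense(private, w2, [0.0])[0]))
```

Replacing `rnn_final_state` with an LSTM or GRU cell would leave the task-specific part of the sketch unchanged.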
Figure 5. Model architecture of deep parameter-sharing recurrent neural networks in the consistent
source-target scenario.
learn a transferable predictive model for bioelectrical signal analysis. Similarly, we divide
the multi-class classification task into several multi-class subtasks. Specifically, each subtask
conducts the multi-class analysis as the original task, but the bioelectrical data in its dataset
are from one patient/one group of patients. By solving these subtasks simultaneously,
the multi-task learning model can capture the invariant features among all the subtasks,
which makes the model more transferable.
According to the task decomposition in the inconsistent source-target scenario, we rebuild
the original datasets. For a multi-task setting, the original dataset $D$ is formulated as:
$$D = \begin{bmatrix} x_{1_s:1_e} & y_1 \\ x_{2_s:2_e} & y_2 \\ \vdots & \vdots \\ x_{n_s:n_e} & y_n \end{bmatrix}, \tag{5}$$
where $n$ is the number of data samples, $x_{i_s:i_e} = x_{i_s:i_s+1} \oplus x_{i_s:i_s+2} \oplus \cdots \oplus x_{i_s:i_e}$ is the time
sequence data, $y_i \in \{1, 2, \cdots, K\}$, and $K$ is the number of categories.
Therefore, we decompose $D$ into several sub-datasets $D_1^*, D_2^*, \cdots, D_M^*$, where $M$ is the
number of subtasks. For task $i$, its sub-dataset $D_i^*$ is formulated as:
$$D_i^* = \begin{bmatrix} x^i_{1_s:1_e} & y^i_1 \\ x^i_{2_s:2_e} & y^i_2 \\ \vdots & \vdots \\ x^i_{n_{i_s}:n_{i_e}} & y^i_{n_i} \end{bmatrix}, \tag{6}$$
where $n_i$ is the number of data samples in task $i$.
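The decomposition of $D$ into sub-datasets can be sketched as a simple grouping step. The sketch assumes each sample carries an identifier of the patient or group of patients it comes from; the field layout, the identifiers, and the toy values are hypothetical.

```python
from collections import defaultdict


def split_by_source(samples):
    """Group (source_id, signal, label) samples into one sub-dataset per source."""
    sub_datasets = defaultdict(list)
    for source_id, signal, label in samples:
        # Each subtask keeps the full multi-class label, only the data source differs.
        sub_datasets[source_id].append((signal, label))
    return dict(sub_datasets)


# Toy dataset D (made-up values): source id, signal segment, class label in {1..K}.
samples = [
    ("patient_a", [0.1, 0.4, 0.2], 1),
    ("patient_b", [0.3, 0.9, 0.5], 3),
    ("patient_a", [0.0, 0.2, 0.1], 2),
    ("patient_b", [0.6, 0.1, 0.7], 1),
    ("patient_a", [0.5, 0.5, 0.3], 3),
]

sub_datasets = split_by_source(samples)
num_subtasks = len(sub_datasets)                              # M
samples_per_task = {k: len(v) for k, v in sub_datasets.items()}  # n_i for each task
```

Here `num_subtasks` plays the role of $M$ and `samples_per_task` gives $n_i$ for each sub-dataset.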
5.2. Deep Parameter-Sharing Neural Networks in Inconsistent Source-Target Scenario
We also propose deep parameter-sharing neural networks for the inconsistent source-
target scenario. Still, the deep parameter-sharing neural network is a generic network
structure, and any neural network components can be applied to it. As shown in Figure 6,
each subtask can apply any existing neural network (blue block) to extract its individual
features. Subsequently, the outputs from the shared neural layers (red block) from all the
tasks are linked to the individual output layers of each task, generating their respective
outputs.
Figure 6. Model architecture of deep parameter-sharing neural network in the inconsistent source-
target scenario.
$$h_i = \mathrm{Input}(x_i).$$
We consider the analysis as a multi-class classification problem. Then the output
of task $i$ is:
$$\tilde{y} = \mathrm{softmax}(w h_s + b).$$
Correspondingly, the loss function for task $i$ is:
$$L_i = -\sum_{p=1}^{n_i} \sum_{q=1}^{K} y_{ip}^{q} \log \tilde{y}_{ip}^{q}.$$
The overall loss function of the network can then be obtained as:
$$L = \sum_{i=1}^{M} L_i.$$
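The loss computation above can be illustrated with a short sketch: a softmax cross-entropy summed over the samples and classes of each task, then summed over the tasks. The function names and the toy logits are hypothetical, and the small constant `eps` is added only for numerical safety.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one vector of K logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def task_loss(one_hot_labels, logits_per_sample, eps=1e-12):
    """Cross-entropy of one task: sum over its n_i samples and K classes."""
    loss = 0.0
    for y, logits in zip(one_hot_labels, logits_per_sample):
        probs = softmax(logits)
        loss -= sum(y_q * math.log(max(p_q, eps)) for y_q, p_q in zip(y, probs))
    return loss


def overall_loss(tasks):
    """Overall loss: sum of the M per-task losses."""
    return sum(task_loss(labels, logits) for labels, logits in tasks)


# Toy example with M = 2 tasks and K = 3 classes (made-up logits).
tasks = [
    ([[1, 0, 0], [0, 1, 0]], [[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]]),
    ([[0, 0, 1]], [[0.1, 0.4, 2.2]]),
]
per_task = [task_loss(labels, logits) for labels, logits in tasks]
total = overall_loss(tasks)
```

With uniform logits the per-sample loss equals $\log K$, which is a convenient sanity check for an untrained network.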
As depicted in Figure 7, the training dataset for each task $i$ undergoes a series of
transformations, beginning with a 1D convolutional neural layer and followed by a 1D
max-pooling layer. Subsequently, a flatten layer is applied to convert the resulting
feature tensor into a feature vector, represented by $h_i$. By combining this feature vector
with those from other tasks, we create a shared feature vector for all tasks, denoted by
$h = h_1 \oplus h_2 \oplus \cdots \oplus h_M$. Following this, the network incorporates several fully-connected
layers, with parameters shared among all relevant tasks to connect to each task’s dedicated
output layer.
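The data flow of this architecture can be sketched as follows. The sketch assumes one input segment per task in each forward pass, a single shared fully connected layer with ReLU, and random untrained weights; all sizes (M = 3 tasks, K = 4 classes, 6 shared units) are hypothetical.

```python
import math
import random


def conv_pool_flatten(signal, kernel, pool):
    """Task-specific extractor: valid 1D convolution + ReLU, then max-pooling."""
    width = len(kernel)
    conv = [
        max(0.0, sum(kernel[j] * signal[t + j] for j in range(width)))
        for t in range(len(signal) - width + 1)
    ]
    # A single feature map is already a flat vector after pooling.
    return [max(conv[t:t + pool]) for t in range(0, len(conv) - pool + 1, pool)]


def dense(vec, weights):
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(1)
M, K, SHARED_UNITS = 3, 4, 6  # hypothetical sizes

# One stand-in signal segment and one private kernel per task.
signals = [[math.sin(0.3 * t + i) for t in range(12)] for i in range(M)]
kernels = rand_matrix(rng, M, 3)

# h_i for every task, then the shared vector h = h_1 (+) h_2 (+) ... (+) h_M.
features = [conv_pool_flatten(x, k, pool=2) for x, k in zip(signals, kernels)]
h = [value for h_i in features for value in h_i]

# Shared fully connected layer, followed by one softmax output layer per task.
w_shared = rand_matrix(rng, SHARED_UNITS, len(h))
h_s = [max(0.0, v) for v in dense(h, w_shared)]
task_outputs = [softmax(dense(h_s, rand_matrix(rng, K, SHARED_UNITS))) for _ in range(M)]
```

Because every output layer reads the same shared vector, the shared fully connected weights receive training signal from all tasks.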
Figure 7. Model architecture of deep parameter-sharing convolutional neural network in the
inconsistent source-target scenario.
Figure 8. Model architecture of deep parameter-sharing recurrent neural network in the inconsistent
source-target scenario.
by independent experts. There are two leads’ readings in each record, and we choose the
first lead signals in the experiments.
7. Results
7.1. Experiments on Consistent Source-Target Scenario
As described in Section 3, the domain can be person-wise or database-wise. For the
person-wise domain, we select the MIT-BIH Long-Term Database (ltdb) [50] because it contains sufficient
recordings of each person to train deep neural networks, while the other
three databases only contain 30-min data, which are not enough for training a deep neural
network. For the database-wise domain, we perform the simulation four times on
each database independently to validate the efficacy of the proposed MTL framework. We
give the training and testing details of the MTL deep neural networks separately and compare
the results with those under STL.
1/5 as testing. We repeat the same process using another recording, 14134, and record
the results of these two experiments in Table 1.
8. Discussion
8.1. Consistent Source-Target Scenario
The testing accuracy is the ratio of the number of correctly classified beats to the
number of total beats in the testing phase. The STL schemes conduct the traditional machine
learning process where each deep neural network has one multi-class output. From Table 1,
we can see that the average accuracy of the person-wise MTL neural networks increases by 2.5% and
2.2% compared with the STL networks. Since MTL separates all the arrhythmia classes
in the training process, it is easier for the neural networks to distinguish between
classes, and thus they achieve better classification performance than when training on all classes
together. Additionally, we compare our results with those from [52], where the neural networks
are trained on different proportions of the training data in each recording. In their results,
the highest testing accuracy is 97% for record 14046, while our MTL achieves 98.2%, and 87%
for record 14134, while ours achieves 92.2%. Therefore, the proposed MTL parameter-sharing
framework can improve the performance of the predictive model.
When compared to the results of STL networks, the average testing accuracy of
database-wise MTL increases by 3.8%, 3.0%, 3.5%, and 2.3% for the four selected datasets,
respectively, as shown in Table 2. The results from these experiments suggest that the beat
classification accuracy can be improved by deep MTL parameter-sharing systems when
training and testing on the same database.
9. Conclusions
In this paper, we have investigated the accuracy improvement problem of analyzing
bioelectrical data by employing methods based on deep learning. To achieve this goal,
a deep MTL scheme has been proposed to reuse the knowledge from source domains in the
target domain. In particular, we first reframe the bioelectrical signal analysis problem
as a multi-task learning problem by segmenting the data analysis into a list of tasks and
then create corresponding datasets for those tasks. Then, we train the parameter-sharing
neural network for these tasks and apply the shared layers to the target domain. Any
generic deep learning network can be utilized in the framework, and we implement four
networks as examples: 1D CNN, vanilla RNN, LSTM, and GRU. To evaluate the proposed
approach, we conduct extensive experiments on arrhythmia classification using four public
ECG databases. For each scenario, we test the system in person-wise and database-wise
domains separately. The experimental results show that the proposed framework can
improve the classification accuracy in all situations. This indicates that our system successfully
transfers the knowledge from the source domain to the target domain and that
the parameter-sharing layers can capture the common features while discarding person-specific
features from records or databases. The accuracy can be improved by up to 5.2% in the
MIT-BIH Arrhythmia Database using LSTM networks.
Author Contributions: Conceptualization, X.C.; Methodology, X.C.; Software, J.K.M. and P.R.;
Validation, J.K.M. and P.R.; Formal analysis, J.K.M.; Data curation, P.R.; Writing—original draft,
X.C.; Writing—review & editing, J.K.M., P.R. and M.H.; Visualization, P.R. and M.H.; Supervision,
X.C.; Project administration, X.C. All authors have read and agreed to the published version of
the manuscript.
Funding: This research received no external funding.
Data Availability Statement: The data presented in this study are openly available in PhysioNet as
follows: mitdb database available at https://doi.org/10.13026/C2F305; ltdb database available at
https://doi.org/10.13026/C2KS3F; svdb database available at https://doi.org/10.13026/C2V30W;
incartdb database available at https://doi.org/10.13026/C2V88N.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Theis, F.J.; Meyer-Bäse, A. Biomedical Signal Analysis: Contemporary Methods and Applications; MIT Press: Cambridge, MA, USA,
2010.
2. Choi, B.J.; Kim, J.H.; Yang, W.J.; Han, D.J.; Park, J.; Park, D.W. Parylene-based flexible microelectrode arrays for the electrical
recording of muscles and the effect of electrode size. Appl. Sci. 2020, 10, 7364. [CrossRef]
3. Aoyama, T.; Kohno, Y. Temporal and quantitative variability in muscle electrical activity decreases as dexterous hand motor
skills are learned. PLoS ONE 2020, 15, e0236254. [CrossRef] [PubMed]
4. Behadada, O.; Chikh, M.A. An interpretable classifier for detection of cardiac arrhythmias by using the fuzzy decision tree. Artif.
Intell. Res. 2013, 2, 45. [CrossRef]
5. Guler, I.; Ubeyli, E.D. Multiclass support vector machines for EEG-signals classification. IEEE Trans. Inf. Technol. Biomed. 2007,
11, 117–126. [CrossRef] [PubMed]
6. Frénay, B.; De Lannoy, G.; Verleysen, M. Improving the transition modelling in hidden Markov models for ECG segmentation.
In Proceedings of the ESANN, Bruges, Belgium, 22–24 April 2009.
7. Chen, X.; Ji, J.; Loparo, K.; Li, P. Real-time personalized cardiac arrhythmia detection and diagnosis: A cloud computing
architecture. In Proceedings of the 2017 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI), Orlando,
FL, USA, 16–19 February 2017; pp. 201–204.
8. Sze, V.; Chen, Y.H.; Yang, T.J.; Emer, J. Efficient processing of deep neural networks: A tutorial and survey. arXiv 2017,
arXiv:1703.09039.
9. Lipton, Z.C.; Kale, D.C.; Elkan, C.; Wetzell, R. Learning to diagnose with LSTM recurrent neural networks. arXiv 2015,
arXiv:1511.03677.
10. Rajpurkar, P.; Hannun, A.Y.; Haghpanahi, M.; Bourn, C.; Ng, A.Y. Cardiologist-level arrhythmia detection with convolutional
neural networks. arXiv 2017, arXiv:1707.01836.
11. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014;
pp. 580–587.
12. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
13. Chen, J.; Samuel, R.D.J.; Poovendran, P. LSTM with bio inspired algorithm for action recognition in sports videos. Image Vis.
Comput. 2021, 112, 104214. [CrossRef]
14. Shoeb, A.H.; Guttag, J.V. Application of machine learning to epileptic seizure detection. In Proceedings of the 27th International
Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 975–982.
15. Jambukia, S.H.; Dabhi, V.K.; Prajapati, H.B. Classification of ECG signals using machine learning techniques: A survey. In
Proceedings of the 2015 International Conference on Advances in Computer Engineering and Applications, Ghaziabad, India,
19–20 March 2015; pp. 714–721.
16. Woodland, P.C. Speaker adaptation for continuous density HMMs: A review. In ISCA Tutorial and Research Workshop (ITRW) on
Adaptation Methods for Speech Recognition; ISCA: Sophia-Antipolis, France, 2001.
17. Li, X.; Bilmes, J. Regularized adaptation of discriminative classifiers. In Proceedings of the 2006 IEEE International Conference on
Acoustics Speech and Signal Processing Proceedings, Toulouse, France, 14–19 May 2006; Volume 1, p. I.
18. Raina, R.; Battle, A.; Lee, H.; Packer, B.; Ng, A.Y. Self-taught learning: Transfer learning from unlabeled data. In Proceedings of
the 24th International Conference on Machine Learning, Corvallis, OR, USA, 20–24 June 2007; pp. 759–766.
19. Yang, J.; Yan, R.; Hauptmann, A.G. Cross-domain video concept detection using adaptive svms. In Proceedings of the 15th ACM
International Conference on Multimedia, Augsburg, Germany, 25–29 September 2007; pp. 188–197.
20. Blitzer, J.; McDonald, R.; Pereira, F. Domain adaptation with structural correspondence learning. In Proceedings of the 2006
Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, 22–23 July 2006; pp. 120–128.
21. Dai, W.; Yang, Q.; Xue, G.R.; Yu, Y. Boosting for transfer learning. In Proceedings of the 24th International Conference on Machine
Learning, Corvallis, OR, USA, 20–24 June 2007; pp. 193–200.
22. Ruder, S. An overview of multi-task learning in deep neural networks. arXiv 2017, arXiv:1706.05098.
23. Lin, S.; Shi, C.; Chen, J. GeneralizedDTA: Combining pre-training and multi-task learning to predict drug-target binding affinity
for unknown drug discovery. BMC Bioinform. 2022, 23, 367. [CrossRef] [PubMed]
24. Ji, J.; Chen, X.; Luo, C.; Li, P. A deep multi-task learning approach for ECG data analysis. In Proceedings of the 2018 IEEE EMBS
International Conference on Biomedical & Health Informatics (BHI), Las Vegas, NV, USA, 4–7 March 2018; pp. 124–127.
25. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436. [CrossRef] [PubMed]
26. Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62. [CrossRef]
27. Minsky, M.; Papert, S.A.; Bottou, L. Perceptrons: An Introduction to Computational Geometry; MIT Press: Cambridge, MA, USA, 2017.
28. Rafiq, M.; Bugmann, G.; Easterbrook, D. Neural network design for engineering applications. Comput. Struct. 2001, 79, 1541–1552.
[CrossRef]
29. Liu, S.; Borovykh, A.; Grzelak, L.A.; Oosterlee, C.W. A neural network-based framework for financial model calibration. J. Math.
Ind. 2019, 9, 9. [CrossRef]
30. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [CrossRef]
31. Bengio, Y. Learning deep architectures for AI. Found. Trends Mach. Learn. 2009, 2, 1–127. [CrossRef]
32. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach.
Intell. 2013, 35, 1798–1828. [CrossRef]
33. Socher, R.; Bengio, Y.; Manning, C.D. Deep learning for NLP (without magic). In Proceedings of the 50th Annual Meeting of the
Association for Computational Linguistics: Tutorial Abstracts, Jeju Island, Republic of Korea, 8–14 July 2012; p. 5.
34. Hu, B.; Lu, Z.; Li, H.; Chen, Q. Convolutional neural network architectures for matching natural language sentences. arXiv 2015,
arXiv:1503.03244. [CrossRef]
35. Medsker, L.; Jain, L. Recurrent neural networks. Des. Appl. 2001, 5, 2.
36. Hammad, M.; Abd El-Latif, A.A.; Hussain, A.; Abd El-Samie, F.E.; Gupta, B.B.; Ugail, H.; Sedik, A. Deep learning models for
arrhythmia detection in IoT healthcare applications. Comput. Electr. Eng. 2022, 100, 108011. [CrossRef]
37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef] [PubMed]
38. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations
using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
39. Hasan, N.I.; Bhattacharjee, A. Deep learning approach to cardiovascular disease classification employing modified ECG signal
from empirical mode decomposition. Biomed. Signal Process. Control 2019, 52, 128–140. [CrossRef]
40. Hannun, A.Y.; Rajpurkar, P.; Haghpanahi, M.; Tison, G.H.; Bourn, C.; Turakhia, M.P.; Ng, A.Y. Cardiologist-level arrhythmia
detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 2019, 25, 65–69. [CrossRef]
41. Hu, R.; Chen, J.; Zhou, L. A transformer-based deep neural network for arrhythmia detection using continuous ECG signals.
Comput. Biol. Med. 2022, 144, 105325. [CrossRef]
42. Shahin, M.; Oo, E.; Ahmed, B. Adversarial multi-task learning for robust end-to-end ECG-based heartbeat classification. In
Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC),
Montreal, QC, Canada, 20–24 July 2020; pp. 341–344.
43. Mormont, R.; Geurts, P.; Marée, R. Multi-task pre-training of deep neural networks for digital pathology. IEEE J. Biomed. Health
Inform. 2020, 25, 412–421. [CrossRef]
44. Misra, I.; Shrivastava, A.; Gupta, A.; Hebert, M. Cross-stitch networks for multi-task learning. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3994–4003.
45. Zhang, Z.; Luo, P.; Loy, C.C.; Tang, X. Facial landmark detection by deep multi-task learning. In Proceedings of the European
Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 94–108.
46. Collobert, R.; Weston, J. A unified architecture for natural language processing: Deep neural networks with multitask learning.
In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 160–167.
47. Luong, M.T.; Le, Q.V.; Sutskever, I.; Vinyals, O.; Kaiser, L. Multi-task sequence to sequence learning. arXiv 2015, arXiv:1511.06114.
48. Liu, P.; Qiu, X.; Huang, X. Adversarial multi-task learning for text classification. arXiv 2017, arXiv:1704.05742.
49. Liu, P.; Qiu, X.; Huang, X. Recurrent neural network for text classification with multi-task learning. arXiv 2016, arXiv:1605.05101.
50. Goldberger, A.L.; Amaral, L.A.; Glass, L.; Hausdorff, J.M.; Ivanov, P.C.; Mark, R.G.; Mietus, J.E.; Moody, G.B.; Peng, C.K.; Stanley,
H.E. PhysioBank, PhysioToolkit, and PhysioNet. Circulation 2000, 101, e215–e220. [CrossRef] [PubMed]
51. Cantzos, D.; Dimogianopoulos, D.; Tseles, D. ECG diagnosis via a sequential recursive time series—Wavelet classification scheme.
In Proceedings of the IEEE EUROCON, Zagreb, Croatia, 1–4 July 2013.
52. Bhanot, K.; Peddoju, S.K.; Bhardwaj, T. A model to find optimal percentage of training and testing data for efficient ECG analysis
using neural network. Int. J. Syst. Assur. Eng. Manag. 2018, 9, 12–17. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.