Abstract
Purpose – The purpose of this study is to propose an alternative efficient 3D emotion recognition model for
variable-length electroencephalogram (EEG) data.
Design/methodology/approach – The classical AMIGOS data set, which comprises multimodal records of varying lengths on mood, personality and other physiological aspects of emotional response, is used for empirical assessment of the proposed overlapping sliding window (OSW) modelling framework. Two features are extracted using the Fourier and wavelet transforms: normalised band power (NBP) and normalised wavelet energy (NWE), respectively. The arousal, valence and dominance (AVD) emotions are predicted using one-dimensional (1D) and two-dimensional (2D) convolution neural networks (CNN) for both single and combined features.
Findings – The two-dimensional convolution neural network (2D CNN) outcomes on the EEG signals of the AMIGOS data set are observed to yield the highest accuracies, that is, 96.63%, 95.87% and 96.30% for AVD, respectively, which are at least 6% higher than those of the other available competitive approaches.
Originality/value – The present work is focussed on the less explored, complex AMIGOS (2018) data set
which is imbalanced and of variable length. EEG emotion recognition-based work is widely available on
simpler data sets. The following are the challenges of the AMIGOS data set addressed in the present work:
handling of tensor form data; proposing an efficient method for generating sufficient equal-length samples
corresponding to imbalanced and variable-length data; selecting a suitable machine learning/deep learning
model; improving the accuracy of the applied model.
Keywords Electroencephalography (EEG), Emotion recognition (ER),
1D and 2D convolution neural network (CNN)
Paper type Research paper
1. Introduction
Emotions are a manifestation of intuitive states of the mind. They are known to be generated
by events occurring in a person’s environment or internally generated by thoughts [1].
Identification and classification of these emotions using computers have been widely studied
under affective computing and human–computer interface [2].
Emotions are recognised using physiological or non-physiological signals [3].
Electroencephalogram (EEG), electrocardiogram (ECG) [4], galvanic skin response (GSR),
blood volume pulse (BVP) [5] and respiration (RSP) [6] are popular
tools used in the literature to obtain physiological signals, while facial expressions [7], speech [8], body gestures and videos [9] give non-physiological signals. The advantage of using physiological signals for ER is that they are captured directly from the human body, which gives the true response of human intuitions [10], unlike non-physiological signals that can be synthetically elicited. Thus, EEG signals are a suitable tool for the current research. However, since EEG signals involve studying human behaviour directly, there is a limit to the number of samples that can be collected, while deep learning (DL) methods require a large number of samples to work efficiently. Therefore, an innovative resampling method is needed before DL methods can be applied.
The EEG signals are generated by electrical waves corresponding to brain activity elicited by external stimuli [11]. The raw signals need to be pre-processed, and then appropriate features need to be extracted to obtain emotions from the signals. Lastly, an efficient classifier is applied to obtain an appropriate recognition of emotions.
The features of EEG signals are frequently extracted in the time, frequency and time–
frequency domains. The features extracted in the time domain are the Hjorth feature [12],
fractal dimension feature [13] and higher-order crossing feature [14]. The features used in the
frequency domain are power spectral density (PSD) [15], spectral entropy (SE) [16] and
differential entropy [17]. Wavelets and a short-time Fourier transform (STFT) [18] have been
used to extract the time–frequency domain features.
After feature extraction, machine learning (ML) and DL methods are primarily applied
in the literature for classification [19]. The ML methods applied for ER are k-nearest neighbour (KNN),
random forest (RF), decision tree (DT), neural network (NN) and support vector machine
(SVM). The DL methods used for ER are convolution neural network (CNN), long
short-term memory (LSTM), recurrent neural network (RNN) and several other variants. The
DL methods are found to work with greater accuracy [20]. Table 1 shows a summary of DL
methods applied in recent years.
Apart from these, nature-inspired algorithms have also been applied on ER tasks for
feature selection, such as on the DEAP data set along with particle swarm optimisation (PSO)
[21] and firefly optimisation (FO) [30]. At the same time, LSTM and SVM were used as
classifiers. Feature selection through FO has been known to achieve an accuracy of 86.90%,
while PSO feature selection recorded an accuracy of 84.16%.
Emotions in ER can be classified in two ways: discrete emotions, such as anger, happiness,
sadness, disgust, fear and neutral, and emotion models. There are two types of emotion
models: two-dimensional (2D) [31] and three-dimensional (3D) [32]. The 2D emotion model
consists of valence and arousal; valence represents the measure of pleasant and unpleasant,
and arousal represents excitement and calmness. The 3D emotion model comprises AVD.
The arousal and valence emotions are the same as in the 2D emotion model. Dominance is the
third emotional aspect, representing dependence and independence.
1.1 Contribution
The objective of the present work is to develop an efficient ER model for the AMIGOS [33]
data set in 3D emotional space (i.e. AVD) using DL models. AMIGOS is a relatively new data set
among the popular EEG data sets for ER. The following are the challenges of the AMIGOS
data set addressed in the present work:
(1) Handling of tensor form data.
(2) Proposing an efficient method for generating sufficient equal-length samples
corresponding to imbalanced and variable-length data.
(3) Selecting a suitable ML/DL model.
(4) Improving the accuracy of the applied model.
Table 1. Studies on EEG-based emotion recognition using deep learning

Ref., year | Emotions recognised | Feature extraction method | Classifier | Data sets | Accuracy %
[21], 2020 | 2D emotion model | High-order statistics | LSTM | SEED | 90.81
[22], 2020 | Negative, positive and neutral | Electrode frequency distribution map + STFT | CNN | SEED, DEAP | 90.59
[23], 2020 | 3D emotion model | Multi-level feature capsule network (end-to-end network) | Multi-level feature capsule network | DEAP, DREAMER | 98.32
[24], 2020 | Negative, positive and neutral | Local and global inter-channel relation | Regularised graph neural network | SEED, SEED-IV | 85.30
[25], 2021 | 2D emotion model | Differential entropy | Graph convolution network + LSTM | DEAP | 90.60
[26], 2020 | Sad, happy, relax, fear | Time–frequency representation by smoothed pseudo-Wigner–Ville distribution | Configurable CNN, AlexNet, VGG-16, ResNet-50 | Recorded EEG of students of Indian Institute of Information Technology Design and Manufacturing, Jabalpur | 93.01
[27], 2020 | 2D emotion model | End-to-end region-asymmetric convolution neural network | Region-asymmetric convolution neural network | DEAP, DREAMER | 95
[28], 2020 | 2D emotion model | Spectrogram representation | Bidirectional LSTM | AMIGOS | 83.30
[29], 2021 | 2D emotion model | Features extracted from topographic and holographic feature maps | CNN + SVM | AMIGOS | 90.54
The equal-length data samples are generated here by the OSW method. Although the data
can be oversampled using the Python implementation of the Synthetic Minority Oversampling
Technique (SMOTE) [34], SMOTE generates data by replicating the existing examples without
adding any new information. Thus, the OSW method is proposed in the present work,
which induces variability in the sample records while avoiding repetition of the signals.
Feature extraction is undertaken in two modes using normalised band power and normalised wavelet energy.
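The window length and overlap used in the present work are detailed in Section 2; the following is only a minimal NumPy sketch of the general overlapping-sliding-window idea, with the window length and step left as assumed, illustrative parameters.

```python
import numpy as np

def overlapping_windows(signal, win_len=512, step=128):
    """Split a variable-length 1D signal into fixed-length overlapping windows.

    win_len and step (overlap = win_len - step) are illustrative values only;
    the actual settings used in the paper are described in Section 2.
    """
    windows = []
    for start in range(0, len(signal) - win_len + 1, step):
        windows.append(signal[start:start + win_len])
    return np.asarray(windows)            # shape: (num_windows, win_len)

# A 10-second trial and a 30-second trial (assuming 128 Hz sampling) both yield
# equal-length 512-sample windows, only in different numbers.
short_trial = np.random.randn(10 * 128)
long_trial = np.random.randn(30 * 128)
print(overlapping_windows(short_trial).shape, overlapping_windows(long_trial).shape)
```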
The rest of this paper comprises three additional sections. Section 2 provides details of the
emotion recognition system proposed in the research, Section 3 details the results and
discussions and Section 4 provides the conclusions.
Figure 1. Framework for the overlapping sliding window-based emotion recognition system
Figure 2. Overlapping window signal decomposition (windows y0[n] to y100[n] over 1–512 time samples)
$\widehat{P}_B = \dfrac{P_B}{\sum_B P_B}$  (3)
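As a concrete illustration of Eqn (3), the band powers can be obtained from the Fourier spectrum of each window and then normalised by their sum. The sketch below assumes a 128 Hz sampling rate and the usual theta/alpha/beta/gamma bands covering the 4–45 Hz range of the pre-processed signals; these band edges are assumptions for illustration, not values quoted from the paper.

```python
import numpy as np

# Illustrative band definitions covering the 4-45 Hz range of the pre-processed EEG;
# the exact bands used in the paper may differ.
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def normalised_band_power(window, fs=128):
    """Compute NBP for one EEG window: P_B / sum_B(P_B), as in Eqn (3)."""
    spectrum = np.abs(np.fft.rfft(window)) ** 2           # power spectrum of the window
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    band_power = {
        name: spectrum[(freqs >= lo) & (freqs < hi)].sum()
        for name, (lo, hi) in BANDS.items()
    }
    total = sum(band_power.values())
    return {name: p / total for name, p in band_power.items()}
```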
Here, τ = k·2^(−j) and s = 2^(−j) represent the translation and scale of the wavelet transform, respectively, and ψ denotes the mother wavelet, taken here as the Daubechies-4 (db4) wavelet. The signal is further decomposed into cA_n and cD_n, the approximation coefficients at level n (providing low frequencies) and the detailed coefficients at level n (providing high frequencies), respectively. Because the EEG signal provided in the pre-processed data set lies in the range 4 Hz–45 Hz, a five-level decomposition is sufficient for the required four-band information, as shown in Figure 3.
After decomposition of the signal into multilevel wavelet coefficients, the wavelet energy is calculated using the detailed coefficients cD_n of the above five levels, because the emotion information is mostly available at higher frequencies. The formula for calculating wavelet energy is given in Eqn (5):
$WE_n = \sum_n \left| cD_n \right|^2$  (5)
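A minimal sketch of Eqn (5) using the PyWavelets package: a five-level db4 decomposition of one window followed by the energy of each detail-coefficient level. Normalising the energies by their total (to obtain NWE) is shown as a simple division, which is an assumption of this sketch rather than a detail quoted from the paper.

```python
import numpy as np
import pywt

def wavelet_energies(window, wavelet="db4", level=5):
    """Five-level DWT of one EEG window and per-level detail energies, cf. Eqn (5)."""
    coeffs = pywt.wavedec(window, wavelet, level=level)   # [cA5, cD5, cD4, cD3, cD2, cD1]
    details = coeffs[1:]                                  # detail coefficients cD5 ... cD1
    energies = np.array([np.sum(np.abs(cD) ** 2) for cD in details])
    return energies / energies.sum()                      # normalised wavelet energy (assumed)
```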
Figure 3. Wavelet decomposition of different bands
Figure 4. 1D-CNN architecture
Figure 5. 2D-CNN architecture
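The exact 1D and 2D CNN layer configurations are those shown in Figures 4 and 5 and are not reproduced in this text. As an illustration only, a minimal sketch of a small 2D CNN classifier for binary high/low emotion labels is given below in PyTorch (the library used in this work); every layer size here is an assumption, not the architecture of Figure 5.

```python
import torch
import torch.nn as nn

class Small2DCNN(nn.Module):
    """Illustrative 2D CNN for binary high/low emotion classification.
    Layer sizes are assumptions; the actual architecture is given in Figure 5."""
    def __init__(self, in_channels=1, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):                 # x: (batch, channels, height, width) feature maps
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```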
3. Results and discussions
All experiments in the present work are performed on an Intel i5 machine with 8 GB RAM using the Python 3.7 programming language. PyTorch version 1.7.0 is used to implement the CNNs, and the execution of the CNNs is achieved on a Kaggle GPU.
The present work is executed in the following steps:
Table 2. Content of MATLAB files

Joined_data (1 × 20) | In this array, there are 20 columns corresponding to the 16 short videos and 4 long videos shown to the participants. Each cell consists of a matrix of size y × 17, where y varies with the length of the video. Of the 17 columns, 14 correspond to the EEG signals, 2 to the ECG and the last to the GSR signal.
Labels_self-assessment (1 × 20) | In this array, there are 20 columns corresponding to the 16 short videos and 4 long videos shown to the participants. Each cell consists of a 1 × 12 matrix, wherein the 12 columns correspond to 12 assessments (arousal, valence, dominance, liking, familiarity and seven basic emotions) made by the participant for every video. The first five dimensions are measured on a scale of 1–9, where 1 is the lowest and 9 is the highest. The seven basic emotions (neutral, disgust, happiness, surprise, anger, fear and sadness) are coded in binary (i.e. 0 or 1).
Labels_ext_annotation (1 × 20) | In this array, there are 20 columns corresponding to the 16 short videos and 4 long videos shown to the participants. Each cell consists of a z × 3 matrix, where z is the number of 20-second segments in a video and the three columns hold the segment number, arousal and valence.
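The structures of Table 2 can be read in Python with scipy.io.loadmat. The file name and key names in the sketch below are assumptions for illustration, since Table 2 describes only the array contents, not the file layout; they should be checked against the AMIGOS distribution.

```python
from scipy.io import loadmat

# 'data_preprocessed_p01.mat' and the key names are assumed for illustration;
# the actual per-participant file and variable naming should be verified.
mat = loadmat("data_preprocessed_p01.mat")

joined_data = mat["joined_data"][0]            # 1 x 20 cell array -> 20 video trials
labels_self = mat["labels_selfassessment"][0]  # 1 x 20 cell array of 1 x 12 ratings

trial = joined_data[0]          # y x 17 matrix for the first video
eeg = trial[:, :14]             # first 14 columns: EEG channels
ecg = trial[:, 14:16]           # next 2 columns: ECG
gsr = trial[:, 16]              # last column: GSR
arousal, valence, dominance = labels_self[0][0, :3]
```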
Table 3. Coding of AVD from 1–9 to 0–1

High arousal (HA) = 1 | Low arousal (LA) = 0 | High valence (HV) = 1 | Low valence (LV) = 0 | High dominance (HD) = 1 | Low dominance (LD) = 0
>4.5 | ≤4.5 | >4.5 | ≤4.5 | >4.5 | ≤4.5
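As a small illustration of the Table 3 coding, any self-assessment rating above 4.5 is mapped to 1 (high) and any rating of 4.5 or below is mapped to 0 (low):

```python
import numpy as np

ratings = np.array([2.0, 4.5, 4.6, 7.8, 9.0])     # example AVD ratings on the 1-9 scale
binary_labels = (ratings > 4.5).astype(int)        # Table 3: >4.5 -> 1, <=4.5 -> 0
print(binary_labels)                               # [0 0 1 1 1]
```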
From Table 6, it is evident that the accuracies obtained after resampling by SMOTE are at
least 1% lower than those of NOSW and at least 15.87% lower than those of OSW. Comparing the ML and
DL methods, SVM performs better under SMOTE resampling, since the sample size is small, whereas
under NOSW and OSW resampling, 2D CNN performs best, since the sample size is large. A feature-
wise comparison of all the methods is shown in Figure 7.
From Figure 7(a), it can be observed that SVM outperforms the other methods, and no specific
pattern indicates whether individual or combined features perform better. It is evident from
Figure 7(b) that the NWE feature provides higher accuracy in the case of NOSW. In contrast,
the NBP feature provides higher accuracies with OSW for the DL methods, as shown in Figure 7(c).
The combined features for 2D CNN give higher accuracies for both NOSW and OSW, as shown in
Figures 7(b) and 7(c). Thus, by combining the observations from Table 6 and Figures 7(b) and 7(c),
a 2D CNN classifier with a combined feature vector is found to be the best for all the emotional
indices, with 96.63%, 95.87% and 96.30% accuracy, respectively.
The execution history of the 2D CNN with combined features for the overlapping window is shown
in Figure 8 in terms of the loss and accuracy curves for arousal, valence and dominance,
respectively. The loss curves show the training and validation losses, which are expected to be as
close to each other as possible. The accuracy curves show the accuracy obtained for each emotion index over
20 epochs.
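The curves of Figure 8 are the per-epoch training history over 20 epochs. A minimal PyTorch sketch of recording such a history is shown below; the optimiser, learning rate and loss function are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn as nn

def train_with_history(model, train_loader, val_loader, epochs=20, lr=1e-3):
    """Record per-epoch training/validation loss and validation accuracy (cf. Figure 8).
    Optimiser, learning rate and loss function are illustrative assumptions."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    history = {"train_loss": [], "val_loss": [], "val_acc": []}
    for _ in range(epochs):
        model.train()
        running = 0.0
        for x, y in train_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
            running += loss.item() * len(y)
        history["train_loss"].append(running / len(train_loader.dataset))

        model.eval()
        val_loss, correct = 0.0, 0
        with torch.no_grad():
            for x, y in val_loader:
                out = model(x)
                val_loss += loss_fn(out, y).item() * len(y)
                correct += (out.argmax(dim=1) == y).sum().item()
        n = len(val_loader.dataset)
        history["val_loss"].append(val_loss / n)
        history["val_acc"].append(correct / n)
    return history
```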
The results are also compared in terms of the execution time of individual versus combined features, as shown in Figure 9.
Figure 7. Comparison of methods' performance (in terms of accuracies): SVM, 1D-CNN and 2D-CNN with NBP, NWE and combined {NBP, NWE} features for arousal, valence and dominance under (a) SMOTE, (b) NOSW and (c) OSW
Figure 9 shows that as the sample size increases from SMOTE → NOSW → OSW, the execution time increases significantly for SVM with both individual and combined features. The reason for this observation is that SVM cannot be executed on GPUs since it involves complex calculations. It is also observed in this study that the basic SVM performs poorly when the sample size is large (as in the case of the combined features with OSW in Table 6); the same has been reported in [39].
Table 7 compares the results obtained in the present study with those of ERS articles published from
2018 onwards on the AMIGOS data set.
Figure 8. Loss and accuracy curve for AVD
Figure 9. Time of execution of ML/DL methods for individual and combined features (in mins), grouped by SMOTE, NOSW and OSW
Table 7. Comparison of the present study with earlier ERS studies on the AMIGOS data set (Ref, year; arousal, valence and dominance accuracies in %; modality; features; classifier)
From Table 7, the emotions are recognised using only EEG data in [29, 33, 40, 42]; the other
studies were carried out using multimodal data. The first study conducted on the AMIGOS data
set [33] provides an initial analysis, produces a very low accuracy of 57.7% and therefore poses
an open research challenge. The accuracy was improved to 71.54% in [40] in the same year, in
which the features were extracted using a CNN. A multimodal ERS was proposed in [28, 41],
producing accuracies of up to 84%. The highest accuracy achieved on the AMIGOS data set prior
to this work is 90.54%, using the CNN + SVM model in [29]. Finally, the present proposed model
improves the accuracy to 96.63% with a single modality (EEG) through a 2D CNN classifier.
Siddharth et al. in 2019 [42] worked on four data sets (DEAP, DREAMER, AMIGOS and
MAHNOB-HCI) using LSTM. However, they observed that LSTM is difficult to implement on the
AMIGOS data set, which has data of varying lengths. This indicates the necessity of an efficient
pre-processing method prior to classification. The present paper offers an efficient classification
strategy for EEG records of varying lengths through decomposition of the data using an OSW
approach, which provides an efficient alternative for handling imbalanced variable-length data
prior to classification.
4. Conclusions
Despite significant developments in the field of DL and its suitability to various
applications, almost 59% of researchers have used an SVM with RBF kernels for BCIs
[19]. This is due to the unavailability of large-scale data sets for BCIs. However, DL
models are widely applied to speech and visual modalities. A BCI data set provides
genuine human responses, as the signals are taken directly from the human body. Thus, ER
using brain signals is preferred. There is a need for an "off-the-shelf" method to conduct
research on BCIs with high accuracy. The accuracy found in BCIs is generally low,
especially for the AMIGOS data set.
The present contribution focusses on obtaining a predictive outcome for the 3D emotion
responses to EEG signals in the context of imbalanced variable-length records. The novelty of the
present paper is that it proposes the application of OSW with CNN to the intricate AMIGOS data
set, aimed at highly accurate prediction of 3D emotions in contrast to the accuracy achieved by
the existing approaches available in the literature. Most of the earlier analyses of the AMIGOS data
set have pivoted on 2D emotion analysis. The current paper views EEG (14 channels) on
3D emotions for predictive inference and presents a comparative assessment of the predictive
accuracy with that of Siddharth et al. (2018) [40]. Thus, the present approach is found to have
the highest accuracy with respect to all three AVD emotion indices as compared to similar
works referenced in the literature (Table 7).
The present work can be further extended to multiple physiological-signal modalities, as well as
to include responses to video interventions, such as in an automatic video recommendation
system for enhancing the mood of individuals. Another possible extension of this work can be
accomplished by representing the signal features in 2D/3D form and subsequently combining
them with the respective video/image features.
References
1. Kövecses Z. Emotion concepts. New York: Springer Science and Business Media; 2012 Dec 6.
2. Alarcao SM, Fonseca MJ. Emotion recognition using EEG signals: a survey. IEEE Trans Affect
Comp. 2017 Jun 12; 10(3): 374-93.
3. Saxena A, Khanna A, Gupta D. Emotion recognition and detection methods: a comprehensive
survey. J Art Int Sys. 2020 Feb 7; 2(1): 53-79.
4. Lin YP, Wang CH, Jung TP, Wu TL, Jeng SK, Duann JR, Chen JH. EEG-based emotion recognition
in music listening. IEEE (Inst Electr Electron Eng) Trans Biomed Eng. 2010 May 3; 57(7):
1798-806.
5. Santamaria-Granados L, Munoz-Organero M, Ramirez-Gonzalez G, Abdulhay E, Arunkumar NJ.
Using deep convolutional neural network for emotion detection on a physiological signals dataset
(AMIGOS). IEEE Access. 2018 Nov 23; 7: 57-67.
6. Xiefeng C, Wang Y, Dai S, Zhao P, Liu Q. Heart sound signals can be used for emotion
recognition. Sci Rep. 2019 Apr 24; 9(1): 1.
7. Recio G, Schacht A, Sommer W. Recognizing dynamic facial expressions of emotion: specificity
and intensity effects in event-related brain potentials. Biol Psychol. 2014 Feb 1; 96: 111-25.
8. El Ayadi M, Kamel MS, Karray F. Survey on speech emotion recognition: features, classification
schemes, and databases. Pattern Recogn. 2011 Mar 1; 44(3): 572-87.
9. Gunes H, Piccardi M. Bi-modal emotion recognition from expressive face and body gestures. J Net
Comp Appl. 2007 Nov 1; 30(4): 1334-45.
10. Song T, Liu S, Zheng W, Zong Y, Cui Z, Li Y, Zhou X. Variational instance-adaptive graph for
EEG emotion recognition. IEEE Trans Affect Comp. 2021 Mar 9, Early access. doi: 10.1109/TAFFC.2021.3064940.
11. Barlow JS. The electroencephalogram: its patterns and origins. Cambridge, MA and London: MIT
Press; 1993.
12. Yazıcı M, Ulutaş M. Classification of EEG signals using time domain features. 2015 23rd Signal
Processing and Communications Applications Conference (SIU): IEEE; 2015 May 16. 2358-2361.
13. Liu Y, Sourina O. Real-time fractal-based valence level recognition from EEG. Transactions on
computational science; Berlin, Heidelberg: Springer: 2013; 18. p. 101-120.
14. Petrantonakis PC, Hadjileontiadis LJ. Emotion recognition from EEG using higher order
crossings. IEEE Trans Inf Tech Biomed. 2009 Oct 23; 14(2): 186-97.
15. Kim C, Sun J, Liu D, Wang Q, Paek S. An effective feature extraction method by power spectral
density of EEG signal for 2-class motor imagery-based BCI. Med Biol Eng Comput. 2018 Sep;
56(9): 1645-58.
16. Zhang R, Xu P, Chen R, Li F, Guo L, Li P, Zhang T, Yao D. Predicting inter-session performance of
SMR-based brain–computer interface using the spectral entropy of resting-state EEG. Brain
Topogr. 2015 Sep; 28(5): 680-90.
17. Zhang J, Wei Z, Zou J, Fu H. Automatic epileptic EEG classification based on differential entropy
and attention model. Eng Appl Artif Intelligence. 2020 Nov 1; 96: 103975.
18. Al-Fahoum AS, Al-Fraihat AA. Methods of EEG signal features extraction using linear analysis
in frequency and time-frequency domains. Int Scholarly Res Notices. 2014: 1-7.
19. Gu X, Cao Z, Jolfaei A, Xu P, Wu D, Jung TP, Lin CT. EEG-based brain-computer interfaces
(BCIS): a survey of recent studies on signal sensing technologies and computational intelligence
approaches and their applications. IEEE ACM Trans Comput Biol Bioinf. 2021 Jan 19. doi: 10.1109/TCBB.2021.3052811.
20. Tao W, Li C, Song R, Cheng J, Liu Y, Wan F, Chen X. EEG-based emotion recognition via channel-
wise attention and self attention. IEEE Trans Affect Com. 2020 Sep 22. doi: 10.1109/TAFFC.2020.3025777.
21. Sharma R, Pachori RB, Sircar P. Automated emotion recognition based on higher order statistics
and deep learning algorithm. Bio Sig Pro Cont. 2020 Apr 1; 58: 101867.
22. Wang F, Wu S, Zhang W, Xu Z, Zhang Y, Wu C, Coleman S. Emotion recognition with
convolutional neural network and EEG-based EFDMs. Neuropsychologia. 2020 Sep 1; 146: 107506.
23. Liu Y, Ding Y, Li C, Cheng J, Song R, Wan F, Chen X. Multi-channel EEG-based emotion
recognition via a multi-level features guided capsule network. Comput Biol Med. 2020 Aug 1; 123:
103927.
24. Zhong P, Wang D, Miao C. EEG-based emotion recognition using regularized graph neural
networks. IEEE Trans Affect Comp. 2020 May 11. doi: 10.1109/TAFFC.2020.2994159.
25. Yin Y, Zheng X, Hu B, Zhang Y, Cui X. EEG emotion recognition using fusion model of graph
convolutional neural networks and LSTM. Appl Soft Comput. 2021 Mar 1; 100: 106954.
26. Khare SK, Bajaj V. Time-frequency representation and convolutional neural network-based
emotion recognition. IEEE Trans Neural Net Lear Sys. 2020 Jul 31; 32(7): 2901-2909.
27. Cui H, Liu A, Zhang X, Chen X, Wang K, Chen X. EEG-based emotion recognition using an end-to-
end regional-asymmetric convolutional neural network. Knowl Base Syst. 2020 Oct 12; 205:
106243.
28. Li C, Bao Z, Li L, Zhao Z. Exploring temporal representations by leveraging attention-based
bidirectional LSTM-RNNs for multi-modal emotion recognition. Inf Process Management. 2020
May 1; 57(3): 102185.
29. Topic A, Russo M. Emotion recognition based on EEG feature maps through deep learning
network. Eng Sci Tech Int J. 2021 Apr 16. doi: 10.1016/j.jestch.2021.03.012.
30. He H, Tan Y, Ying J, Zhang W. Strengthen EEG-based emotion recognition using firefly
integrated optimization algorithm. Appl Soft Comput. 2020 Sep 1; 94: 106426.
31. Russell JA. A circumplex model of affect. J Personal Soc Psychol. 1980 Dec; 39(6): 1161.
32. Verma GK, Tiwary US. Affect representation and recognition in 3D continuous valence–arousal–
dominance space. Multimed Tool Appl. 2017 Jan; 76(2): 2159-83.
33. Correa JA, Abadi MK, Sebe N, Patras I. Amigos: a dataset for affect, personality and mood
research on individuals and groups. IEEE Trans Affect Com. 2018 Nov 30; 12(2): 479-493.
34. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling
technique. J Artif intelligence Res. 2002 Jun 1; 16: 321-57.
35. Angelov P, Sperduti A. Challenges in deep learning. In: ESANN 2016 - 24th European symposium
on artificial neural networks. ESANN 2016 - 24th European symposium on artificial neural
networks. i6doc.com publication, BEL; 2016. 489-496. ISBN 9782875870278.
36. Bracewell RN, Bracewell RN. The Fourier transform and its applications. New York, NY:
McGraw-Hill; 1986 Feb.
37. Daubechies I. Ten lectures on wavelets, CBMS conf. Series Appl Math; 1992 Jan 1; 61.
38. Le QV. A tutorial on deep learning part 2: autoencoders, convolutional neural networks and
recurrent neural networks. Google Brain. 2015 Oct 20: 1-20 [online] Available from: https://cs.
stanford.edu/~quocle/tutorial2.pdf.
39. Cervantes J, Li X, Yu W, Li K. Support vector machine classification for large data sets via
minimum enclosing ball clustering. Neurocomputing. 2008 Jan 1; 71(4–6): 611-9.
40. Siddharth, Jung TP, Sejnowski TJ. Multi-modal approach for affective computing. 2018 40th
Annual International Conference of the IEEE Engineering in Medicine and Biology Society
(EMBC); IEEE; 2018 Jul 18. 291-294.
41. Tung K, Liu PK, Chuang YC, Wang SH, Wu AY. Entropy-assisted multi-modal emotion
recognition framework based on physiological signals. 2018 IEEE-EMBS Conference on
Biomedical Engineering and Sciences (IECBES); IEEE: 2018 Dec 3. 22-26.
42. Siddharth S, Jung TP, Sejnowski TJ. Utilizing deep learning towards multi-modal bio-sensing and
vision-based affective computing. IEEE Trans Affect Com. 2019 May 14. doi: 10.1109/TAFFC.2019.2916015.
Annexure
Annexure is available online for this article.
Corresponding author
Shruti Garg can be contacted at: gshruti@bitmesra.ac.in