Emotion Recognition With Machine Learning Using EEG Signals: Omid Bazgir Zeynab Mohammadi Seyed Amir Hassan Habibi


Emotion Recognition with Machine Learning Using EEG Signals

Omid Bazgir*
Department of Electrical and Computer Engineering, Texas Tech University, Lubbock, Texas, USA
Email: omid.bazgir@ttu.edu

Zeynab Mohammadi
Department of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran

Seyed Amir Hassan Habibi
Department of Neurology of Rasool Akram Hospital, Iran University of Medical Sciences, Tehran, Iran

Abstract: In this research, an emotion recognition system is developed based on the valence/arousal model using electroencephalography (EEG) signals. EEG signals are decomposed into the gamma, beta, alpha and theta frequency bands using the discrete wavelet transform (DWT), and spectral features are extracted from each frequency band. Principal component analysis (PCA) is applied to the extracted features as a transform that preserves the dimensionality, to make the features mutually uncorrelated. Support vector machine (SVM), K-nearest neighbor (KNN) and artificial neural network (ANN) classifiers are used to classify emotional states. The cross-validated SVM with a radial basis function (RBF) kernel, using extracted features of 10 EEG channels, achieves 91.3% accuracy for arousal and 91.1% accuracy for valence, both in the beta frequency band. Our approach shows better performance compared to existing algorithms applied to the "DEAP" dataset.

Keywords: Emotion, Machine Learning, Valence-arousal, EEG, DWT, PCA, SVM, KNN.

I. INTRODUCTION

Emotional states are associated with a wide variety of human feelings, thoughts and behaviors; hence, they affect our ability to act rationally in cases such as decision-making, perception and human intelligence. Therefore, studies on emotion recognition using emotional signals enhance brain-computer interface (BCI) systems as an effective subject for clinical applications and human social interactions [1]. Physiological signals are being used to investigate emotional states while considering natural aspects of emotions to elucidate therapeutics for psychological disorders such as autism spectrum disorder (ASD), attention deficit hyperactivity disorder (ADHD) and anxiety disorder [2]. In recent years, developing emotion recognition systems based on EEG signals has become a popular research topic among cognitive scientists.

To design an emotion recognition system using EEG signals, effective feature extraction and optimal classification are the main challenges. EEG signals are non-linear, non-stationary, buried in various sources of noise and random in nature [3]. Thus, handling and extracting meaningful features from EEG signals plays a crucial role in the effective design of an emotion recognition system. Extracted features quantify the EEG signals and are then used as attributes of the classifiers. A variety of features have been extracted from the time domain [4], frequency domain [5], and joint time-frequency domain [6] of EEG signals in intelligent emotion recognition systems. The wavelet transform is able to decompose signals into specific frequency bands with minimum time-resolution loss. To this end, choosing an appropriate mother wavelet is crucial. Murugappan [7] considered four different mother wavelets, namely 'db4', 'db8', 'sym8' and 'coif5', to extract statistical features, including standard deviation, power and entropy, from the EEG signal. The KNN classifier was employed to classify five categories of emotions (disgust, happy, surprise, fear, and neutral). The highest accuracy was about 82.87% on 62 channels and 78.57% on 24 channels. Nasehi and Pourghassem [8] applied the Gabor function and wavelet transform to extract spectral, spatial and temporal features from four EEG channels. An artificial neural network (ANN) classifier was used to classify six kinds of emotions (happiness, surprise, anger, fear, disgust and sadness), with 64.78% accuracy. Ishino and Hagiwara extracted the mean and variance from the power spectral density (PSD) and wavelet coefficients of EEG data. Then, a neural network was trained on the principal components of the features to classify four types of emotion (joy, sorrow, relaxation and anger) with a 67.7% classification rate. Mohammadi et al. [9] extracted spectral features, including energy and entropy of wavelet coefficients, from 10 EEG channels. The maximum classification accuracy using KNN was 84% for arousal and 86% for valence. Jie et al. [10] employed the Kolmogorov-Smirnov (K-S) test to select an appropriate channel for extracting sample entropy as a feature and used it as the input of an SVM. The maximum accuracy of Jie's method is 80.43% for arousal and 71.16% for valence. Ali et al. [11] combined wavelet energy, wavelet entropy, modified energy and statistical features of EEG signals, using three classifiers (SVM, KNN, and quadratic discriminant analysis (QDA)) to classify emotional states. The overall classification accuracy of Ali's method was 83.8%.

Different theories and principles have been proposed by experts in psychology and cognitive sciences about defining and discriminating emotional states, such as being cognitive or non-cognitive. In this ongoing debate, one claim is that, instead of classifying
emotions into boundless common, basic types such as happiness, sadness, anger, fear, joy and so forth, emotion can be classified based on valence-arousal model introduced the rooted brain connectivity [12, 13]. The valence-arousal model is a bi-dimensional model which includes four emotional states: high arousal high valence (HAHV), high arousal low valence (HALV), low arousal high valence (LAHV) and low arousal low valence (LALV) [1]. Therefore, each of the common emotional states can be modeled and interpreted based on the valence-arousal model, as shown in figure 1.

Figure 1. Interpretation of different emotions based on valence-arousal model [12].

In this paper, an emotion recognition system is developed based on the valence-arousal model. We applied the wavelet transform to EEG signals, then energy and entropy were extracted from 4 different decomposed frequency bands. PCA was used to make the attributes uncorrelated. SVM, KNN and ANN are utilized to classify emotional states along the arousal/valence dimensions.

II. METHODOLOGY

A. Data Acquisition

In this study, the DEAP database, a database for emotion analysis using physiological signals, labeled based on the valence-arousal-dominance emotion model, is used [14]. The DEAP dataset includes 32 participants. To stimulate the auditory and visual cortex, 1-min long music videos were played for each participant. 40 music videos were shown to each participant, and seven different modalities were recorded. Only EEG is used in this study; more information is provided in [14]. The 40 video clips were pre-determined so that their valence/arousal time length would be large enough in valence/arousal scope. Each participant was asked to grade each music video from 1 to 9 for valence, arousal, dominance and liking. If the grade was greater than 4.5, the arousal/valence label is high; if the grade was less than 4.5, the arousal/valence label is low [14]. All the signals were recorded with a 512 Hz sampling frequency.

B. Channel selection

According to Coan et al. [15], positive and negative emotions are respectively associated with left and right frontal brain regions. They have shown that the brain activity decreases more in the frontal region of the brain as compared to other regions. Therefore, the following channels were selected for investigation in this study: F3-F4, F7-F8, FC1-FC2, FC5-FC6, and FP1-FP2.

C. Preprocessing

To reduce the electronic amplifier, power line and external interference noise, the average mean reference (AMR) method was utilized. For each selected channel, the mean is calculated and subtracted from every single sample of that channel. To reduce the effect of individual differences, all the values were normalized to [0, 1].

D. Feature extraction

Due to the effective multi-resolution capability of the DWT in the analysis of non-stationary signals, we followed our previous work [9] for the feature extraction, in which the DWT was applied to the windowed EEG signals of the selected channels. The EEG signals are windowed to increase the possibility of quick detection of the emotional state. Thus, 4- and 2-second temporal windows with 50% overlap were chosen. The EEG signals are decomposed into 5 different bands, namely theta (4-8 Hz), alpha (8-16 Hz), beta (16-32 Hz), gamma (32-64 Hz) and noise (> 64 Hz), via the db4 mother wavelet function. Afterwards, the entropy and energy were extracted from each window of every frequency band.

Entropy is a measure of the amount of information within the signal. The entropy of the signal over a temporal window within a specific frequency band is computed as:

$$ENT_j = -\sum_{k=1}^{N} D_j(k)^2 \log\left(D_j(k)^2\right) \tag{1}$$

By summing the squares of the wavelet coefficients over the temporal window, the energy for each frequency band is computed:

$$ENG_j = \sum_{k=1}^{N} D_j(k)^2, \quad k = 1, 2, \ldots, N \tag{2}$$

where j is the wavelet decomposition level (frequency band), and k indexes the N wavelet coefficients within the j-th frequency band.

E. Principal component analysis

PCA is an eigenvector-based statistical mechanism, which employs singular value decomposition, that transforms a set of correlated features into mutually uncorrelated features [16], called principal components or PCs. If we define the extracted training features as $X$, then the PCA coefficients ($Z$) can be written as:

$$Z = X\phi \tag{3}$$
where $\phi$ is the underlying basis vector of the training set. The same basis vector is then applied to the extracted features of the test set, $\hat{X}$, to obtain the principal components of the test set, as:

$$\hat{Z} = \hat{X}\phi \tag{4}$$

The PCA was applied to the stacked extracted features, without any dimensionality reduction, to generate PCs. The PCs are used as the input vectors of the three classifiers addressed in the next section.

F. Classification

In this research, kernel SVM, KNN (as in [9]) and ANN are used for classification with eight-fold cross-validation. The goal of SVM, as a parametric classifier, is to formulate a separating hyperplane by solving a quadratic optimization problem in the feature space [16]. Kernel SVM finds the optimum hyperplane in a higher dimensional space that maximizes the generalization capability, where the distance between margins is maximum. The RBF kernel is a function which projects input vectors into a Gaussian space, using equation (5). The generalization property makes kernel SVM insensitive to overfitting [17].

$$K_{RBF}(x, x') = \exp\left[-\sigma \|x - x'\|^2\right] \tag{5}$$

KNN is a non-parametric instance-based classifier, which classifies an object based on the majority vote of its neighbors. The votes are assigned according to the distance of the K nearest neighbors to the object.

ANN is a semi-parametric classifier, flexible for non-linear classification, which aggregates multilayer logistic regressions [18]. For a multilayer feedforward network, the nonlinear activation function of the output layer is a sigmoid, equation (6):

$$\phi(x) = \frac{1}{1 + e^{-x}} \tag{6}$$

which predicts the probability as an output. In the hidden layers, the rectified linear unit (ReLU) activation function, equation (7), is employed. The ReLU provides sparsity and a reduced likelihood of vanishing gradients; therefore, the network convergence is improved and the learning process, including backpropagation, is faster.

$$\phi(x) = \max(0, x) \tag{7}$$

The implemented ANN incorporates three layers: two hidden layers with the ReLU activation function and an output layer with the sigmoid activation function.

The radial basis function (RBF) is implemented as the SVM kernel, with two different scale factors, $\sigma = 2$ and $\sigma = 0.1$. KNN is implemented with five different values of nearest neighbors ($3 \le K \le 7$), and the optimum classification accuracy is achieved with K = 5.

The SVM, ANN and KNN are implemented with eight-fold cross-validation to estimate the average accuracy of each classifier. It partitions the data randomly into eight folds of equal size, so that each fold includes the attributes of four participants. The learner was trained on seven folds and tested on the remaining fold. The process was repeated eight times, each time with a different fold selected for testing.

III. RESULTS

We trained SVM, ANN and KNN classifiers with extracted features from a pair of channels (F3-F4, F7-F8, FC1-FC2, FC5-FC6, FP1-FP2) of all frequency bands in 2- and 4-second temporal windows. The RBF kernel of the SVM is implemented with $\sigma = 2$. Table 1 shows the cross-validated accuracy of the classifiers with each pair of channels. The highest accuracy is obtained with the F3-F4 pair of EEG channels using SVM, which is 90.8% (sensitivity 88.18%, specificity 89.27%) for arousal and 90.6% (sensitivity 89.9%, specificity 87.7%) for valence. Therefore, to reduce the computational cost, the F3-F4 pair of channels can be utilized.

Table 1. Cross-validated accuracy (%) of classifiers from each pair of EEG channels in temporal windows of (a) 4 s and (b) 2 s

Classifier     F3-F4   F7-F8   FC1-FC2   FC5-FC6   FP1-FP2
(a) 4 s
Arousal-SVM    90.8    87.9    87.2      85.1      88.1
Valence-SVM    90.6    84.9    89.8      85.5      88.5
Arousal-KNN    76.4    79      75.9      77.6      71.5
Valence-KNN    79      80.4    77        75.5      73.6
Arousal-ANN    82.1    83.2    71.1      77.7      80.4
Valence-ANN    84.7    83.1    73.5      74.4      78.9
(b) 2 s
Arousal-SVM    85.7    79.7    80.3      83.1      82.3
Valence-SVM    84.8    81.2    82.2      82.8      83.2
Arousal-KNN    68.5    70.2    65.3      66.6      64.5
Valence-KNN    69      68      68.2      69.9      63.5
Arousal-ANN    74.3    76.2    73.6      70.2      73.4
Valence-ANN    71.2    79.2    69.4      68.5      74.6

The SVM, ANN and KNN are also trained using all the features extracted from all the channels within each frequency band (gamma, beta, alpha, theta). The cross-validated accuracy of each classifier trained on the principal components of each band is shown in table 2. The SVM classifier trained with beta frequency band features gives the highest accuracy, 91.3% (sensitivity 89.1%, specificity 88.4%) for arousal and 91.1% (sensitivity 87.3%, specificity 86.8%) for valence.

IV. DISCUSSION AND CONCLUSION

In this research, EEG signals from the DEAP dataset are windowed into 2- and 4-second windows with 50% overlap, then decomposed into 5 frequency bands: gamma, beta, alpha, theta and noise. Spectral features, entropy and energy, of each window within the pre-specified frequency band are extracted. Afterwards, PCA is applied as a transform preserving the input dimensionality to produce uncorrelated attributes. The ANN, KNN and SVM are trained with attributes from different frequency bands, and different pairs of channels, by eight-fold cross-validation.
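The windowing, feature and PCA steps summarized in this section can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the wavelet coefficients of one band are assumed to come from a separate DWT routine (the db4 decomposition itself is not reproduced here), the features are mean-centered before projection (the text does not say whether centering is used), and a small `eps` is added inside the logarithm to avoid log(0).

```python
import numpy as np

def sliding_windows(signal, fs, win_sec, overlap=0.5):
    """Split a 1-D signal into fixed-length windows with fractional overlap."""
    size = int(win_sec * fs)
    step = max(1, int(size * (1.0 - overlap)))
    starts = range(0, len(signal) - size + 1, step)
    return np.array([signal[s:s + size] for s in starts])

def band_energy(coeffs):
    """Eq. (2): sum of squared wavelet coefficients of one band."""
    return float(np.sum(np.square(coeffs)))

def band_entropy(coeffs, eps=1e-12):
    """Eq. (1): -sum(D^2 * log(D^2)); eps is an added guard against log(0)."""
    sq = np.square(coeffs)
    return float(-np.sum(sq * np.log(sq + eps)))

def fit_pca_basis(X_train):
    """Return the training mean and the basis phi from an SVD of the data.

    With more samples than features, phi is square, so the transform keeps
    the input dimensionality (no components are dropped).
    """
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt.T

def apply_pca(X, mean, phi):
    """Eqs. (3)-(4): project features with the basis learned on the training set."""
    return (X - mean) @ phi

# Demo on synthetic data (values are arbitrary, for illustration only).
rng = np.random.default_rng(0)
eeg = rng.normal(size=8 * 512)                 # 8 s of fake signal at 512 Hz
windows = sliding_windows(eeg, fs=512, win_sec=4)
# Stand-in for DWT output: the raw window is used as "coefficients" here.
features = np.array([[band_energy(w), band_entropy(w)] for w in windows])

X_train = rng.normal(size=(200, 8))
X_train[:, 1] = X_train[:, 0] + 0.1 * X_train[:, 1]   # make two features correlated
X_test = rng.normal(size=(50, 8))
mean, phi = fit_pca_basis(X_train)
Z_train = apply_pca(X_train, mean, phi)
Z_test = apply_pca(X_test, mean, phi)
print(windows.shape, features.shape, Z_train.shape, Z_test.shape)
```

The basis is estimated on the training features only and reused unchanged on the test features, which mirrors equations (3) and (4) and keeps test data out of the transform when it is used inside a cross-validation loop.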
Table 2. Cross-validated accuracy (%) of classifiers from different frequency bands in temporal windows of (a) 4 s and (b) 2 s

Classifier     Gamma   Beta    Alpha   Theta
(a) 4 s
Arousal-SVM    89.8    91.3    90.4    89.4
Valence-SVM    89.7    91.1    90.9    89.1
Arousal-KNN    75      75.1    72.6    73.8
Valence-KNN    77      78.1    78      76.1
Arousal-ANN    81.3    86.5    83.5    80.7
Valence-ANN    84.8    88.3    87.2    82.6
(b) 2 s
Arousal-SVM    81.4    89      88.68   86.6
Valence-SVM    82.43   89.34   89.72   86.1
Arousal-KNN    76      73      73.7    75.6
Valence-KNN    74.3    73.2    73.8    72.4
Arousal-ANN    83.2    78.4    82.7    81.9
Valence-ANN    79.1    75.2    86.1    84.3

It is evident from comparing the cross-validated accuracy results shown in tables 1 and 2 that the SVM classifier outperforms KNN and ANN. In the previous work [9], the maximum classification accuracy was 86.75% using KNN. In this study, applying PCA made the extracted features uncorrelated. In addition, picking an appropriate kernel (RBF) greatly increased the cross-validated accuracy.

Comparing the accuracy rates of tables 1 and 2 shows that using a 4-second window leads to higher accuracy; therefore, the features are more meaningful. It is also noticeable that emotional state discrimination, based on the valence/arousal area, is higher across frequency bands than across channel pairs. The minimum accuracy in table 2 (a) for valence/arousal SVM is 89.1%, whereas in table 1 (a) the maximum accuracy for valence/arousal SVM is 90.8%. Therefore, it is better to extract features from all the channels in each frequency band, then train a classifier.

Table 3 shows a comparison between our results and those of other researchers who have conducted research on the DEAP dataset.

Table 3. Accuracy comparison of different studies on DEAP dataset

Reference    Year    Number of Channels    Classifier    Accuracy
[10]         2014    5                     SVM           80.4%
[11]         2016    --                    SVM           83.8%
[9]          2017    10                    KNN           86.7%
Our study    2018    10                    SVM           91.3%

One can increase the accuracy with ensemble learning, using the trained SVM and extracted features in different frequency bands, which makes the algorithm time-consuming or somewhat inappropriate for real-time applications, such as self-driving cars.

For future work, other transforms such as independent component analysis (ICA) or linear discriminant analysis (LDA) could be applied to the extracted features. Other robust classifiers such as random forest, deep neural network, or recurrent neural network may also lead to higher cross-validated accuracy.

ACKNOWLEDGMENT

The authors are extremely grateful to Dr. Mahmood Amiri of the Kermanshah University of Medical Sciences, and Javad Frounchi from the Department of Electrical and Computer Engineering of University of Tabriz for their support and guidance.

V. REFERENCES

[1] Y. Zhang, S. Zhang, and X. Ji, "EEG-based classification of emotions using empirical mode decomposition and autoregressive model," Multimedia Tools and Applications, pp. 1-14, 2018.
[2] L. Santamaria-Granados, M. Munoz-Organero, G. Ramirez-Gonzalez, E. Abdulhay, and N. J. I. A. Arunkumar, "Using Deep Convolutional Neural Network for Emotion Detection on a Physiological Signals Dataset (AMIGOS)," 2018.
[3] X. Li, D. Song, P. Zhang, G. Yu, Y. Hou, and B. Hu, "Emotion recognition from multi-channel EEG data through convolutional recurrent neural network," in Bioinformatics and Biomedicine (BIBM), 2016 IEEE International Conference on, 2016, pp. 352-359: IEEE.
[4] K. Takahashi, "Remarks on SVM-based emotion recognition from multi-modal bio-potential signals," in Robot and Human Interactive Communication, 2004. ROMAN 2004. 13th IEEE International Workshop on, 2004, pp. 95-100: IEEE.
[5] X.-W. Wang, D. Nie, and B.-L. Lu, "EEG-based emotion recognition using frequency domain features and support vector machines," in International Conference on Neural Information Processing, 2011, pp. 734-743: Springer.
[6] R.-N. Duan, J.-Y. Zhu, and B.-L. Lu, "Differential entropy feature for EEG-based emotion classification," in Neural Engineering (NER), 2013 6th International IEEE/EMBS Conference on, 2013, pp. 81-84: IEEE.
[7] M. Murugappan, "Human emotion classification using wavelet transform and KNN," in Pattern analysis and intelligent robotics (ICPAIR), 2011 international conference on, 2011, vol. 1, pp. 148-153: IEEE.
[8] S. Nasehi, H. Pourghassem, and I. Isfahan, "An optimal EEG-based emotion recognition algorithm using gabor," WSEAS Transactions on Signal Processing, vol. 3, no. 8, pp. 87-99, 2012.
[9] Z. Mohammadi, J. Frounchi, and M. Amiri, "Wavelet-based emotion recognition system using EEG signal," Neural Computing and Applications, vol. 28, no. 8, pp. 1985-1990, 2017.
[10] X. Jie, R. Cao, and L. Li, "Emotion recognition based on the sample entropy of EEG," Bio-medical materials and engineering, vol. 24, no. 1, pp. 1185-1192, 2014.
[11] M. Ali, A. H. Mosa, F. Al Machot, and K. Kyamakya, "EEG-based emotion recognition approach for e-healthcare applications," in Ubiquitous and Future Networks (ICUFN), 2016 Eighth International Conference on, 2016, pp. 946-950: IEEE.
[12] R. Horlings, D. Datcu, and L. J. Rothkrantz, "Emotion recognition using brain activity," in Proceedings of the 9th international conference on computer systems and technologies and workshop for PhD students in computing, 2008, p. 6: ACM.
[13] T. Dalgleish and M. Power, Handbook of cognition and emotion. John Wiley & Sons, 2000.
[14] S. Koelstra et al., "Deap: A database for emotion analysis; using physiological signals," IEEE Transactions on Affective Computing, vol. 3, no. 1, pp. 18-31, 2012.
[15] J. A. Coan, J. J. Allen, and E. Harmon-Jones, "Voluntary facial expression and hemispheric asymmetry over the frontal cortex," Psychophysiology, vol. 38, no. 6, pp. 912-925, 2001.
[16] S. Theodoridis, A. Pikrakis, K. Koutroumbas, and D. Cavouras, Introduction to pattern recognition: a matlab approach. Academic Press, 2010.
[17] O. Bazgir, S. A. H. Habibi, L. Palma, P. Pierleoni, and S. Nafees, "A Classification System for Assessment and Home Monitoring of Tremor in Patients with Parkinson's Disease," Journal of Medical Signals and Sensors, vol. 8, no. 2, 2018.
[18] O. Bazgir, J. Frounchi, S. A. H. Habibi, L. Palma, and P. Pierleoni, "A neural network system for diagnosis and assessment of tremor in Parkinson disease patients," in Biomedical Engineering (ICBME), 2015 22nd Iranian Conference on, 2015, pp. 1-5: IEEE.
