A Novel Neural Network Model Based On Cerebral Hemispheric Asymmetry For EEG Emotion Recognition
Yang Li¹,², Wenming Zheng¹,*, Zhen Cui³, Tong Zhang¹,² and Yuan Zong¹
¹ Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, China
² School of Information Science and Engineering, Southeast University, China
³ School of Computer Science and Engineering, Nanjing University of Science and Technology, China
wenming_zheng@seu.edu.cn
(3) Discriminative prediction. Like most supervised models, we add a supervision term to the network to enhance the model's discriminability. Concretely, we apply the softmax function to the transformed hidden states to predict the class labels, i.e.,

$$q_i = \big[\,{h_1^{lS}}^{T}, \cdots, {h_K^{lS}}^{T}, {h_1^{rS}}^{T}, \cdots, {h_K^{rS}}^{T}\,\big]^{T}, \quad (9)$$

$$P(y_i = c \,|\, q_i, G, b) = \frac{\exp(Gq_i + b)}{\sum_k \exp(Gq_k + b)}, \quad (10)$$

$$\tilde{y}_i = \arg\max_c P(y_i = c \,|\, q_i, G, b), \quad (11)$$

where $q_i \in \mathbb{R}^{2Kd_h \times 1}$, the variables $G \in \mathbb{R}^{d_L \times 2Kd_h}$ and $b \in \mathbb{R}^{d_L \times 1}$ are the transform matrix and bias, respectively, $c$ indexes the classes, $y_i$ is the ground-truth label of the $i$-th training sample, and $d_L$ is the number of classes. The loss function of class label prediction can then be expressed as

$$L_c(X_S; \theta_f^l, \theta_f^r, \theta_c) = L\big(G_c(E_f(X_S; \theta_f^l, \theta_f^r); \theta_c),\, y_i\big) = -\sum_i \log P(\tilde{y}_i = c \,|\, q_i, G, b), \quad (12)$$

where $G_c$ denotes the class label classifier of the source domain.
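To make Eqs. (9)–(12) concrete, here is a minimal PyTorch sketch of the prediction head, assuming the $K$ left- and $K$ right-hemisphere hidden states have already been produced by the feature extractors. The class name `EmotionClassifier` and all sizes below are illustrative, not the authors' implementation or configuration.

```python
import torch
import torch.nn as nn

class EmotionClassifier(nn.Module):
    """Prediction head of Eqs. (9)-(12): concatenate the K left- and K
    right-hemisphere hidden states into q_i, then score the d_L emotion
    classes with the affine map (G, b) followed by a softmax."""

    def __init__(self, K: int, d_h: int, d_L: int):
        super().__init__()
        self.linear = nn.Linear(2 * K * d_h, d_L)  # G and b of Eq. (10)

    def forward(self, h_left: torch.Tensor, h_right: torch.Tensor):
        # h_left, h_right: (batch, K, d_h) -> q: (batch, 2*K*d_h), Eq. (9)
        q = torch.cat([h_left, h_right], dim=1).flatten(start_dim=1)
        return self.linear(q)  # unnormalized logits G q + b

# CrossEntropyLoss fuses the softmax of Eq. (10) with the negative
# log-likelihood of Eq. (12). Sizes here are illustrative only.
clf = EmotionClassifier(K=9, d_h=32, d_L=3)
h_l, h_r = torch.randn(16, 9, 32), torch.randn(16, 9, 32)
labels = torch.randint(0, 3, (16,))
logits = clf(h_l, h_r)
L_c = nn.CrossEntropyLoss()(logits, labels)  # L_c of Eq. (12)
y_tilde = logits.argmax(dim=1)               # prediction of Eq. (11)
```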
Algorithm 1 Optimization of BiDANN
Input:
    Training data set $X_S$ and testing data set $X_T$;
    ground-truth label set $L_S$ of the training data set;
    training (source) domain label set $D_S = [D_S^l, D_S^r] = \{0\}$ and testing (target) domain label set $D_T = [D_T^l, D_T^r] = \{1\}$;
    initial learning rate $\alpha$.
Output: parameters $\hat{\theta}_f^l, \hat{\theta}_f^r, \hat{\theta}_c, \hat{\theta}_d, \hat{\theta}_d^l, \hat{\theta}_d^r$.
1: Input $X_S$ and $L_S$ to update the parameters of the classifier:
   $\theta_c \leftarrow \theta_c - \alpha \frac{\partial L_c}{\partial \theta_c}$, $\theta_f^l \leftarrow \theta_f^l - \alpha \frac{\partial L_c}{\partial \theta_f^l}$, $\theta_f^r \leftarrow \theta_f^r - \alpha \frac{\partial L_c}{\partial \theta_f^r}$;
2: Input $X_S$, $X_T$, $D_S$ and $D_T$ to update the parameters of the global discriminator:
   $\theta_d \leftarrow \theta_d - \alpha \frac{\partial L_d}{\partial \theta_d}$, $\theta_f^l \leftarrow \theta_f^l + \alpha \frac{\partial L_d}{\partial \theta_f^l}$, $\theta_f^r \leftarrow \theta_f^r + \alpha \frac{\partial L_d}{\partial \theta_f^r}$;
3: Input $X_S^l$, $X_T^l$, $D_S^l$ and $D_T^l$ to update the parameters of the left hemispheric local discriminator:
   $\theta_d^l \leftarrow \theta_d^l - \alpha \frac{\partial L_d^l}{\partial \theta_d^l}$, $\theta_f^l \leftarrow \theta_f^l + \alpha \frac{\partial L_d^l}{\partial \theta_f^l}$;
4: Input $X_S^r$, $X_T^r$, $D_S^r$ and $D_T^r$ to update the parameters of the right hemispheric local discriminator:
   $\theta_d^r \leftarrow \theta_d^r - \alpha \frac{\partial L_d^r}{\partial \theta_d^r}$, $\theta_f^r \leftarrow \theta_f^r + \alpha \frac{\partial L_d^r}{\partial \theta_f^r}$;
5: If the algorithm has scanned all the data 100 times, then $\alpha \leftarrow 0.9\alpha$ and go to Step 1;
6: return $\hat{\theta}_f^l, \hat{\theta}_f^r, \hat{\theta}_c, \hat{\theta}_d, \hat{\theta}_d^l, \hat{\theta}_d^r$.
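Read procedurally, Algorithm 1 alternates plain gradient-descent steps for the classifier and the discriminators with sign-flipped (gradient-ascent) steps for the feature extractors, which are trained to fool the discriminators. The sketch below writes Steps 1–4 as explicit SGD updates; the attribute and method names on `net` (feature extractors, classifier, discriminators, and their losses) are placeholders for the paper's components, not a published API.

```python
import torch

def sgd_update(loss, modules, step, retain=True):
    """Apply p <- p + step * dL/dp to every parameter in `modules`
    (step = -alpha descends the loss; step = +alpha ascends it)."""
    params = [p for m in modules for p in m.parameters()]
    grads = torch.autograd.grad(loss, params, retain_graph=retain,
                                allow_unused=True)
    with torch.no_grad():
        for p, g in zip(params, grads):
            if g is not None:
                p.add_(step * g)

def bidann_iteration(net, batch, alpha):
    """One pass over Steps 1-4 of Algorithm 1. `net` is assumed to
    expose feat_l/feat_r (hemispheric feature extractors), clf
    (classifier), disc_g/disc_l/disc_r (global and local
    discriminators), and methods computing L_c, L_d, L_d^l, L_d^r."""
    # Step 1: classifier and both extractors descend L_c.
    L_c = net.classification_loss(batch)
    sgd_update(L_c, [net.clf, net.feat_l, net.feat_r], -alpha)

    # Step 2: global discriminator descends L_d while the extractors
    # ascend it -- the adversarial sign flip of Algorithm 1.
    L_d = net.global_domain_loss(batch)
    sgd_update(L_d, [net.disc_g], -alpha)
    sgd_update(L_d, [net.feat_l, net.feat_r], +alpha, retain=False)

    # Step 3: left local discriminator vs. left extractor.
    L_dl = net.left_domain_loss(batch)
    sgd_update(L_dl, [net.disc_l], -alpha)
    sgd_update(L_dl, [net.feat_l], +alpha, retain=False)

    # Step 4: right local discriminator vs. right extractor.
    L_dr = net.right_domain_loss(batch)
    sgd_update(L_dr, [net.disc_r], -alpha)
    sgd_update(L_dr, [net.feat_r], +alpha, retain=False)
```

Step 5's schedule then amounts to multiplying `alpha` by 0.9 after every 100 passes over the data.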
2.2 Optimization of BiDANN
By minimizing $L_c$ and maximizing $L_d^l$, $L_d^r$ and $L_d$, we optimize the objective function of Eq. (1) to reach a saddle point:

$$(\hat{\theta}_f^l, \hat{\theta}_f^r, \hat{\theta}_c) = \arg\min_{\theta_f^l, \theta_f^r, \theta_c} L(X_R; (\theta_f^l, \theta_f^r, \theta_c), \hat{\theta}_d, \hat{\theta}_d^l, \hat{\theta}_d^r), \quad (13)$$

$$\hat{\theta}_d = \arg\max_{\theta_d} L(X_R; \hat{\theta}_f^l, \hat{\theta}_f^r, \hat{\theta}_c, \theta_d, \hat{\theta}_d^l, \hat{\theta}_d^r), \quad (14)$$

$$\hat{\theta}_d^l = \arg\max_{\theta_d^l} L(X_R^l; \hat{\theta}_f^l, \hat{\theta}_f^r, \hat{\theta}_c, \hat{\theta}_d, \theta_d^l, \hat{\theta}_d^r), \quad (15)$$

$$\hat{\theta}_d^r = \arg\max_{\theta_d^r} L(X_R^r; \hat{\theta}_f^l, \hat{\theta}_f^r, \hat{\theta}_c, \hat{\theta}_d, \hat{\theta}_d^l, \theta_d^r). \quad (16)$$
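In practice, DANN-style models [Ganin et al., 2016] reach such a saddle point with ordinary gradient descent by inserting a gradient-reversal layer between the feature extractors and each discriminator: the layer is the identity in the forward pass and flips (and scales) the gradient in the backward pass, so descending the discriminator losses simultaneously ascends them with respect to the features. A minimal PyTorch sketch of this standard trick (an assumption about implementation, not code released with the paper):

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the incoming gradient
    by -lambd in the backward pass, turning the discriminators'
    descent direction into an ascent direction for the extractors."""

    @staticmethod
    def forward(ctx, x, lambd: float = 1.0):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # One gradient per forward input: (x, lambd).
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd: float = 1.0):
    return GradReverse.apply(x, lambd)

# Usage: feed reversed features to a discriminator so that a single
# optimizer minimizing all losses realizes the min-max of Eqs. (13)-(16):
#   domain_logits = discriminator(grad_reverse(features))
```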
For negative emotion, we can see that using the right hemispheric data performs much better than using the left in either DANN or BiDANN-R1, which shows that the right hemisphere processes negative emotion better than the left hemisphere does. For positive emotion, the performance of the left hemispheric EEG data approximates that of the right hemispheric data under the DANN method, and under BiDANN-R1 the left hemispheric EEG data improves by 3% over the right hemispheric data, which suggests that the left hemisphere processes positive emotion better than the right.

Personalized (Subject-Independent) EEG Emotion Recognition
In this experiment, we adopt a leave-one-subject-out cross-validation strategy to evaluate the performance of our model, following the protocol of Zheng et al. [Zheng and Lu, 2016]. This strategy takes one subject's EEG as the testing data and the remaining 14 subjects' EEG as the training data. We report the mean accuracy over the 15 resulting folds as the evaluation criterion; a schematic of this protocol is sketched below.
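As a sketch of the protocol (the per-subject data layout and the `train_and_eval` callback are illustrative placeholders; scikit-learn's `LeaveOneGroupOut` would serve equally well):

```python
import numpy as np

def leave_one_subject_out(data_by_subject, train_and_eval):
    """Leave-one-subject-out protocol: each subject serves once as the
    (target-domain) test set while the remaining subjects form the
    training set. `train_and_eval` stands in for fitting the model on
    one fold and returning its test accuracy."""
    accs = []
    for test_subj in sorted(data_by_subject):
        train = {s: d for s, d in data_by_subject.items() if s != test_subj}
        test = data_by_subject[test_subj]
        accs.append(train_and_eval(train, test))
    # Mean/std over the folds, as reported in Table 2.
    return float(np.mean(accs)), float(np.std(accs))
```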
Here we compare our BiDANN with linear SVM [Suykens and Vandewalle, 1999], KPCA [Schölkopf et al., 1998], TCA [Pan et al., 2011], T-SVM [Collobert et al., 2006], TPT [Sangineto et al., 2014], and the baseline methods DANN, BiDANN-R1 and BiDANN-R2. TCA and KPCA cannot use all of the training EEG data because of the memory and time cost of the singular value decomposition, so in this experiment we use 5000 randomly selected samples as their training data.

Method | ACC/STD (%)
SVM [Suykens and Vandewalle, 1999] | 56.73/16.29
KPCA [Schölkopf et al., 1998] | 61.28/14.62
TCA [Pan et al., 2011] | 63.64/14.88
T-SVM [Collobert et al., 2006] | 72.53/14.00
TPT [Sangineto et al., 2014] | 76.31/15.89
DANN [Ganin et al., 2016] | 75.08/11.18
BiDANN-R1 | 76.97/11.08
BiDANN-R2 | 82.22/7.61
BiDANN | 83.28/9.60

Table 2: The mean accuracies (ACC) and standard deviations (STD) on the SEED database for the personalized EEG emotion recognition experiment.

Table 2 shows the performance on the SEED database. We can see that BiDANN-R2 improves on BiDANN-R1 by 5.2 percentage points, which shows the importance of modeling the discrepancy between left and right cerebral hemispheric data for EEG emotion recognition. Furthermore, BiDANN with local discriminators outperforms BiDANN-R2 by about 1 percentage point. This reveals that the local discriminators help further narrow the distribution difference between the source and target domains on both hemispheres.

3.3 Confusion Matrix
To examine how well each emotion is recognized, we depict the confusion matrices corresponding to the experimental results of our BiDANN. Fig. 5 shows the confusion matrices of the conventional and personalized EEG emotion recognition experiments on the SEED database, respectively.

[Figure 5: The confusion matrices in our experiments. (a) The conventional EEG emotion recognition experiment. (b) The personalized EEG emotion recognition experiment.]

From these two figures, we can make two observations:

(1) Our BiDANN method performs well in recognizing all three types of emotion, especially positive emotion, whose accuracy exceeds 90% in both the conventional and personalized EEG emotion recognition tasks. This shows that EEG signals of the same emotion indeed share similarities, and that EEG signals can be used effectively to decode human emotion.

(2) The mean accuracies of the three emotions over all subjects are negative 86.15%, neutral 93.61% and positive 96.89% in the conventional task (Fig. 5(a)), and negative 80.51%, neutral 74.51% and positive 91.04% in the personalized task (Fig. 5(b)). We can observe that positive emotion is much easier to recognize than negative and neutral emotions. In addition, negative and neutral emotions are much more likely to be confused with each other than with positive emotion. Perhaps the positive stimulus materials resonate more strongly with the participants.
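For reference, the per-emotion accuracies discussed above are simply the diagonal of a row-normalized confusion matrix. A small sketch of that bookkeeping, with the label order assumed to be negative/neutral/positive as in SEED:

```python
import numpy as np

LABELS = ["negative", "neutral", "positive"]  # assumed SEED label order

def confusion_matrix(y_true, y_pred, n_classes=3):
    """Row-normalized confusion matrix: entry (i, j) is the fraction of
    class-i samples predicted as class j, so the diagonal holds the
    per-emotion accuracies plotted in Fig. 5."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm / cm.sum(axis=1, keepdims=True)
```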
4 Conclusion
Emotion is a basic phenomenon common to all human beings. EEG technology provides a direct means to study emotion by measuring the signals of nerve activity in the brain, and EEG emotion recognition models should account for both the neurophysiological nature of the brain and the statistical characteristics of EEG signals. In this paper, we exploited cerebral hemispheric asymmetry for the EEG emotion recognition problem and proposed a novel EEG emotion recognition framework called BiDANN. BiDANN first extracts temporal dynamic features of the left and right hemispheric EEG data separately and then narrows the distribution gap between training and testing data using local and global discriminators. The experimental results show that our BiDANN is superior to the baselines, including other deep learning methods. In future work, we will investigate the effect of hemispheric data on more types of emotion.
Acknowledgments
This work was supported by the National Basic Research Program of China under Grant 2015CB351704, the National Natural Science Foundation of China under Grants 61572009, 61772276, and 61602244, and the Jiangsu Provincial Key Research and Development Program under Grant BE2016616.

References
[Britton et al., 2006] Jennifer C. Britton, K. Luan Phan, Stephan F. Taylor, Robert C. Welsh, Kent C. Berridge, and I. Liberzon. Neural correlates of social and nonsocial emotions: An fMRI study. NeuroImage, 31(1):397–409, 2006.
[Collobert et al., 2006] Ronan Collobert, Fabian Sinz, Jason Weston, and Léon Bottou. Large scale transductive SVMs. Journal of Machine Learning Research, 7(Aug):1687–1712, 2006.
[Etkin et al., 2011] Amit Etkin, Tobias Egner, and Raffael Kalisch. Emotional processing in anterior cingulate and medial prefrontal cortex. Trends in Cognitive Sciences, 15(2):85–93, 2011.
[Ganin et al., 2016] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016.
[Greve et al., 2013] Douglas N. Greve, Lise Van der Haegen, Qing Cai, Steven Stufflebeam, Mert R. Sabuncu, Bruce Fischl, and Marc Brysbaert. A surface-based analysis of language lateralization and cortical asymmetry. Journal of Cognitive Neuroscience, 25(9):1477–1492, 2013.
[Izard, 1991] Carroll E. Izard. The Psychology of Emotions. Springer Science & Business Media, 1991.
[Jenke et al., 2014] Robert Jenke, Angelika Peer, and Martin Buss. Feature extraction and selection for emotion recognition from EEG. IEEE Transactions on Affective Computing, 5(3):327–339, 2014.
[Kim et al., 2013] Min-Ki Kim, Miyoung Kim, Eunmi Oh, and Sung-Phil Kim. A review on the computational methods for emotional state estimation from the human EEG. Computational and Mathematical Methods in Medicine, 2013, 2013.
[Li et al., 2016] Yang Li, Wenming Zheng, Zhen Cui, and Xiaoyan Zhou. A novel graph regularized sparse linear discriminant analysis model for EEG emotion recognition. In International Conference on Neural Information Processing, pages 175–182. Springer, 2016.
[Lotfi and Akbarzadeh-T, 2014] Ehsan Lotfi and M.-R. Akbarzadeh-T. Practical emotional neural networks. Neural Networks, 59:61–72, 2014.
[Musha et al., 1997] Toshimitsu Musha, Yuniko Terasaki, Hasnine A. Haque, and George A. Ivamitsky. Feature extraction from EEGs associated with emotions. Artificial Life and Robotics, 1(1):15–19, 1997.
[Pan et al., 2011] Sinno Jialin Pan, Ivor W. Tsang, James T. Kwok, and Qiang Yang. Domain adaptation via transfer component analysis. IEEE Transactions on Neural Networks, 22(2):199–210, 2011.
[Picard and Picard, 1997] Rosalind W. Picard and Rosalind Picard. Affective Computing, volume 252. MIT Press, Cambridge, 1997.
[Sangineto et al., 2014] Enver Sangineto, Gloria Zen, Elisa Ricci, and Nicu Sebe. We are not all equal: Personalizing models for facial expression analysis with transductive parameter transfer. In Proceedings of the 22nd ACM International Conference on Multimedia, pages 357–366. ACM, 2014.
[Schölkopf et al., 1998] Bernhard Schölkopf, Alexander Smola, and Klaus-Robert Müller. Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5):1299–1319, 1998.
[Storbeck and Clore, 2005] Justin Storbeck and Gerald L. Clore. With sadness comes accuracy; with happiness, false memory: Mood and the false memory effect. Psychological Science, 16(10):785–791, 2005.
[Suykens and Vandewalle, 1999] Johan A. K. Suykens and Joos Vandewalle. Least squares support vector machine classifiers. Neural Processing Letters, 9(3):293–300, 1999.
[Thompson, 2005] Bruce Thompson. Canonical correlation analysis. In Encyclopedia of Statistics in Behavioral Science, 2005.
[Zatorre et al., 1992] Robert J. Zatorre, Marilyn Jones-Gotman, Alan C. Evans, and Ernst Meyer. Functional localization and lateralization of human olfactory cortex. Nature, 360(6402):339–340, 1992.
[Zheng and Lu, 2015] Wei-Long Zheng and Bao-Liang Lu. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development, 7(3):162–175, 2015.
[Zheng and Lu, 2016] Wei-Long Zheng and Bao-Liang Lu. Personalizing EEG-based affective models with transfer learning. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pages 2732–2738. AAAI Press, 2016.
[Zheng, 2017] Wenming Zheng. Multichannel EEG-based emotion recognition via group sparse canonical correlation analysis. IEEE Transactions on Cognitive and Developmental Systems, 9(3):281–290, 2017.