Automatic ECG-Based Emotion Recognition in Music Listening
Abstract—This paper presents an automatic ECG-based algorithm for human emotion recognition. First, we adopt a musical induction method to elicit participants' real emotional states and collect their ECG signals without any deliberate laboratory setting. We then develop an automatic ECG-based emotion recognition algorithm to recognize human emotions elicited by listening to music. Physiological ECG features extracted from the time-domain, frequency-domain, and nonlinear analyses of ECG signals are used to find emotion-relevant features and to correlate them with emotional states. Subsequently, we develop a sequential forward floating selection-kernel-based class separability-based (SFFS-KBCS-based) feature selection algorithm and utilize generalized discriminant analysis (GDA) to effectively select significant emotion-relevant ECG features and to reduce the dimensions of the selected features, respectively. Positive/negative valence, high/low arousal, and four types of emotions (joy, tension, sadness, and peacefulness) are recognized using least squares support vector machine (LS-SVM) recognizers. The results show that the correct classification rates for the positive/negative valence, high/low arousal, and four-emotion classification tasks are 82.78%, 72.91%, and 61.52%, respectively.
1 INTRODUCTION
practical applications, and it may hinder subjects during daily life activities [14]. Therefore, it is important to develop a reliable emotion recognition system for an emotion-based HCI that uses suitable physiological channels and shows acceptable recognition abilities and robustness against any artifacts caused by human movement or human social masking.

In this paper, we use only one ECG channel to classify four types of emotions with an emotion recognition system. The novelty of this paper is the proposed automatic ECG-based emotion recognition algorithm, which consists of a sequential forward floating selection-kernel-based class separability-based (SFFS-KBCS-based) feature selection algorithm, the generalized discriminant analysis (GDA) feature reduction method, and least squares support vector machine (LS-SVM) classifiers for effective recognition of music-induced emotions from the physiological changes of only one ECG channel. For this purpose, we design an accurate experiment for collecting participants' ECG signals during music listening and then develop an automatic emotion recognition algorithm that uses only ECG signals to detect R-waves, generate significant emotion-relevant ECG features, and recognize various human emotions effectively. After extracting ECG features from the time-domain, frequency-domain, and nonlinear analyses of the ECG signals, we select appropriate features from a total of 34 normalized features by using the SFFS-based search strategy combined with the KBCS-based selection criterion. Subsequently, we utilize the GDA to effectively reduce the dimension of the significant emotion-relevant ECG features. Finally, we use the LS-SVM classifiers to recognize arousal-valence emotion states.

This paper is organized as follows. Section 2 presents a brief overview of related research on automatic emotion recognition systems based on ECG signals recorded during music listening. The experimental setup and protocol are presented in Section 3. Section 4 introduces the proposed automatic ECG-based emotion recognition algorithm. Next, Sections 5 and 6 present the experimental results and discussion, respectively. Finally, conclusions are presented in the last section.

2 RELATED RESEARCH

2.1 Dimensional Emotion Models
It is difficult to judge or model human emotions because people express their emotions differently based on such factors as their cognitive processes and subjective feelings. Over the past several decades, many researchers have devoted themselves to developing diverse models of human emotion [15], [16]. Among the various emotion models, the discrete and affective dimensional models are two common approaches, and they are not mutually exclusive. In the discrete models, humans must choose from a prescribed list of word labels to place their current emotion in a discrete category, for example, joy, tension, sadness, anger, or fear [10], [15]. However, stimuli may elicit blended emotions that cannot be adequately expressed in words because the meanings of the chosen words are culturally dependent [4]; the discrete models therefore require more than one word to describe mixed emotions. In the affective dimensional models, humans scale emotions along multiple dimensions for categorization. Two common scales used in recent emotion classification work are valence and arousal [4], [14], [17], [18]. All emotions can be mapped onto the valence and arousal axes of the two-dimensional (2D) emotion plane. For example, joy has positive valence and high arousal, whereas sadness has negative valence and low arousal.
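To make the quadrant structure concrete, the following minimal Python sketch encodes the four emotions targeted in this study as points in the 2D plane. The quadrant assignments for joy and sadness come from the example above; those for tension and peacefulness are implied by the valence/arousal sample groupings reported in Section 5. All names here are illustrative only.

```python
# Minimal sketch of the 2D valence-arousal emotion plane described above.
# Quadrant labels are symbolic; they are not measured coordinates.
EMOTION_QUADRANTS = {
    "joy":          {"valence": "positive", "arousal": "high"},
    "tension":      {"valence": "negative", "arousal": "high"},
    "sadness":      {"valence": "negative", "arousal": "low"},
    "peacefulness": {"valence": "positive", "arousal": "low"},
}

def emotions_in_quadrant(valence: str, arousal: str) -> list:
    """Return the emotions that fall in a given quadrant of the 2D plane."""
    return [e for e, q in EMOTION_QUADRANTS.items()
            if q["valence"] == valence and q["arousal"] == arousal]

print(emotions_in_quadrant("positive", "high"))  # ['joy']
```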
2.2 ECG Signal and Emotion
From the literature review, we found that emotion is systematically elicited through subjective feelings, physiological arousal, motivational tendencies, cognitive processes, and behavioral reactions. From the viewpoint of physiological arousal, it is difficult to find an ANS differentiation of emotions because the ANS is easily influenced by several factors such as attention, social interaction, appraisal, and orientation. However, recently reported studies have shown that ANS activity, comprising the sympathetic nervous system (SNS) and the parasympathetic nervous system (PNS), is an important component of the emotion response [19]. In addition, heart rate (HR) and heart rate variability (HRV), the latter being the variation over time of the period between successive heartbeats, are the ECG features most commonly extracted for emotion recognition [2], [6], [10], [13], [14], [20], [21]. Recent studies have shown that music can produce specific physiological changes in HR and HRV that are associated with different emotions [4], [22], [23], [24], [25], [26].

2.3 Boundary Conditions for Finding the Relationship between Emotion and ECG Signals
This study focuses on the relationship between emotion and ECG signals. We summarize some possible factors that affect the emotion classification results obtained using ECG signals as follows:
1. Some participants may not have enough time to reach a neutral state, or the emotional states elicited by the musical excerpts, during the baseline and music listening stages.
2. The selected music stimuli may not be intense enough to elicit the participants' emotions in the emotion classification tasks.
3. The difficulty of subject-independent classification lies in the intricate variety of nonemotional individual contexts among the participants, rather than in an individual ECG specificity in emotion [4].
4. The participants may be unable to faithfully report their emotional state on the GEMS-9 questionnaire because of lapses of concentration during music listening or misunderstanding of the meaning of the GEMS-9 items.
5. There is no one-to-one relation between emotion and physiological changes in ECG-based features: feeling changes may occur without concomitant autonomic changes in ECG-based features, and vice versa [19].
The abovementioned factors directly or indirectly affect the determination of the ground truths for the classifiers.
RMSSD = \sqrt{\sum_{i=2}^{n}(R_i - R_{i-1})^2 / (n-1)},   (2)

SDSD = \sqrt{\mathrm{RMSSD}^2 - \overline{\Delta R}^2},   (3)

where R_i is the ith RR interval, \bar{R} is the average of the RR intervals, n is the number of RR intervals, and \overline{\Delta R} is the mean of the differences between adjacent RR intervals.
(2) HR-related parameter: the number of R-waves within one epoch divided by 1 min (BPM).
(3) RR-interval-related parameters: the median value of the RR intervals (Median-RRI), the interquartile range of the RR intervals (IQR-RRI), the mean absolute deviation of the RR intervals (MAD-RRI), the mean of the differences between adjacent RR intervals (Diff-RRI), the coefficient of variation of the RR intervals (CV-RRI), and the difference between the maximum and the minimum RR interval (Range). The equations for calculating the RR-interval-related parameters are as follows:

MAD\_RRI = \frac{1}{n}\sum_{i=1}^{n}|R_i - \bar{R}|,   (4)

Diff\_RRI = \frac{1}{n-1}\sum_{i=2}^{n}|R_i - R_{i-1}|,   (5)

CV\_RRI = \sqrt{\sum_{i=1}^{n}(R_i - \bar{R})^2 / n}\;/\;\bar{R}.   (6)
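As an illustration of the time-domain definitions above, the following Python sketch computes the HR- and RR-interval-related features from an array of RR intervals. It follows Eqs. (2)-(6) as reconstructed here and is not the authors' implementation; in particular, the single-epoch BPM bookkeeping is an assumption of this sketch.

```python
import numpy as np

def time_domain_features(rri):
    """Time-domain ECG features from RR intervals in seconds (cf. Eqs. (2)-(6));
    a sketch under the definitions given in the text, not the authors' code."""
    rri = np.asarray(rri, dtype=float)
    n = rri.size
    d = np.diff(rri)                                    # R_i - R_{i-1}, i = 2..n
    rmssd = np.sqrt(np.sum(d ** 2) / (n - 1))           # Eq. (2)
    sdsd = np.sqrt(rmssd ** 2 - np.mean(d) ** 2)        # Eq. (3)
    return {
        "BPM": 60.0 * (n + 1) / rri.sum(),              # R-waves per minute, one epoch assumed
        "Median-RRI": np.median(rri),
        "IQR-RRI": np.percentile(rri, 75) - np.percentile(rri, 25),
        "MAD-RRI": np.mean(np.abs(rri - rri.mean())),   # Eq. (4)
        "Diff-RRI": np.mean(np.abs(d)),                 # Eq. (5)
        "CV-RRI": rri.std() / rri.mean(),               # Eq. (6)
        "Range": rri.max() - rri.min(),
        "RMSSD": rmssd,
        "SDSD": sdsd,
    }
```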
4.5.2 Frequency-domain Analysis
In the frequency-domain analysis, a total of 13 HRV-related parameters are calculated over certain frequency bands. The RR-interval signal needs to be resampled and interpolated to transform it into a regularly sampled series and to prevent the generation of additional harmonic components [34]. After resampling and interpolating, the power spectral density (PSD) of the resampled RR intervals is calculated using a fast Fourier transform (FFT)-based method. The PSD analysis is used to calculate the power within specific frequency ranges and the peak frequencies of three frequency bands: the very-low-frequency range (VLF) (0.0033-0.04 Hz), the low-frequency range (LF) (0.04-0.15 Hz), and the high-frequency range (HF) (0.15-0.4 Hz). From the PSD we calculate the following features: the power in the VLF, LF, and HF bands (VLF, LF, and HF), the total power over the full frequency range (TP), the ratio of the LF-band power to the HF-band power (LF/HF), the LF power normalized to the sum of the LF and HF power (LFnorm), the HF power normalized to the sum of the LF and HF power (HFnorm), the VLF, LF, and HF power expressed as percentages of the total power (pVLF, pLF, and pHF), and the frequencies of the highest peaks in the VLF, LF, and HF bands (VLFfr, LFfr, and HFfr). The equations for the frequency-domain HRV-related parameters are summarized as follows:

TP = VLF + LF + HF,   (7)

LFnorm = LF / (TP − VLF) = LF / (LF + HF),   (8)

HFnorm = HF / (TP − VLF) = HF / (LF + HF),   (9)

pVLF = (VLF / TP) × 100,   (10)

pLF = (LF / TP) × 100,   (11)

pHF = (HF / TP) × 100.   (12)
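The band-power computations of Eqs. (7)-(12) can be sketched as below. The 4 Hz resampling rate, the cubic interpolation, and the periodogram PSD estimator are assumptions of this sketch; the paper states only that an FFT-based method is used.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.signal import periodogram

def frequency_domain_features(rri, fs=4.0):
    """Sketch of the frequency-domain HRV analysis: resample the RR series
    onto an even grid, estimate the PSD, and integrate the VLF/LF/HF bands."""
    rri = np.asarray(rri, dtype=float)
    t = np.cumsum(rri)                                  # R-peak times (s)
    t_even = np.arange(t[0], t[-1], 1.0 / fs)           # even resampling grid
    rri_even = interp1d(t, rri, kind="cubic")(t_even)
    f, psd = periodogram(rri_even - rri_even.mean(), fs=fs)

    def band_power(lo, hi):
        m = (f >= lo) & (f < hi)
        return np.trapz(psd[m], f[m])

    vlf = band_power(0.0033, 0.04)
    lf = band_power(0.04, 0.15)
    hf = band_power(0.15, 0.4)
    tp = vlf + lf + hf                                  # Eq. (7)
    return {
        "VLF": vlf, "LF": lf, "HF": hf, "TP": tp,
        "LF/HF": lf / hf,
        "LFnorm": lf / (lf + hf),                       # Eq. (8)
        "HFnorm": hf / (lf + hf),                       # Eq. (9)
        "pVLF": 100 * vlf / tp,                         # Eq. (10)
        "pLF": 100 * lf / tp,                           # Eq. (11)
        "pHF": 100 * hf / tp,                           # Eq. (12)
    }
```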
4.5.3 Nonlinear Analysis
In the nonlinear analysis, a total of 9 features are computed, comprising 2 ECG-derived respiration (EDR)-related parameters, 3 Poincaré-plot-related parameters, 2 nonlinear-dynamics-related parameters, and 2 autocorrelation-related parameters, described as follows:
A. EDR-related parameters
Before extracting the two EDR-related features (RSPrate and Coherence), we first obtain the EDR signal; the detailed procedure for extracting the EDR signal can be found in [35], [36]. Subsequently, we extract two features: the respiratory rate (RSPrate) and the coherence between the final EDR signal and the RR intervals (Coherence). To estimate RSPrate, the mean of the final EDR signal is first subtracted to remove the DC component. Then, PSD analysis of the EDR signal is applied, and the respiratory rate is obtained as the frequency of the maximum peak in the low-frequency band (0.1-1 Hz) multiplied by 60 [35], [36]. In addition, according to [37], the PSD of the EDR signal in the low-frequency band (0-0.4 Hz) is similar to the PSD of the RR intervals when humans experience positive-valence emotion. Therefore, we calculate the coherence between the final EDR signal and the RR intervals in the low-frequency band (0-0.4 Hz), which is a function of the auto spectral densities of the final EDR signal (G_xx) and the RR intervals (G_yy) and their cross spectral density (G_xy): Coherence = |G_xy|² / (G_xx × G_yy).
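A sketch of the two EDR-related features follows. The fs argument, the common even-time grid shared by the EDR and RR series, and the use of scipy's magnitude-squared coherence (which computes |G_xy|²/(G_xx·G_yy) directly) are assumptions of this sketch, as is averaging the coherence over the 0-0.4 Hz band.

```python
import numpy as np
from scipy.signal import coherence, periodogram

def rsp_rate(edr, fs):
    """RSPrate: frequency (Hz) of the maximum PSD peak in 0.1-1 Hz, times 60."""
    f, psd = periodogram(edr - np.mean(edr), fs=fs)  # mean removal drops the DC component
    band = (f >= 0.1) & (f <= 1.0)
    return 60.0 * f[band][np.argmax(psd[band])]      # breaths per minute

def edr_rri_coherence(edr, rri_even, fs):
    """Coherence feature: magnitude-squared coherence |Gxy|^2/(Gxx*Gyy) between
    the EDR signal and the evenly resampled RR series, summarized over 0-0.4 Hz
    (band averaging is this sketch's assumption)."""
    f, cxy = coherence(edr, rri_even, fs=fs)
    band = (f >= 0.0) & (f <= 0.4)
    return cxy[band].mean()
```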
B. Poincaré-plot-related parameters
The Poincaré plot analysis quantitatively measures the beat-to-beat correlation between adjacent RR intervals. According to the ellipse-fitting process introduced in [38], [39], SD1 represents the standard deviation of the instantaneous beat-to-beat RR-interval variability, SD2 represents the standard deviation of the continuous long-term beat-to-beat RR-interval variability, and SD12 is the ratio of SD1 to SD2.
C. Nonlinear-dynamics-related parameters
In the nonlinear dynamics analysis, we extract two features, ApEn and SampleEn, by calculating the approximate entropy and sample entropy, respectively. The approximate entropy and sample entropy are similar methods that quantify the randomness or predictability of the RR-interval dynamics. They are scale-invariant and model-independent, although there are some computational differences between them. Both assign a nonnegative number to a series of RR intervals, with larger values corresponding to more complexity or irregularity in the data [32], [40].
1) ApEn: a parameter that quantifies the amount of regularity or predictability of the RR intervals. There are two user-specified parameters in the ApEn measure: a run length m and a tolerance window r (in this study, m = 1 and r = 0.25 times the standard deviation of the RR intervals). Given N data points from a time series of RR intervals {rr(1), rr(2),…, rr(N)}, ApEn is computed as follows:
Step 1: Obtain a sequence of vectors rr(i) = [rr(i), rr(i+1),…, rr(i+m−1)], i = 1, 2,…, N−m+1. These vectors represent m successive rr values, starting with the ith value.
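The Poincaré and approximate-entropy features can be sketched as below. The SD1/SD2 identities follow the standard 45-degree-rotated-coordinate form of the ellipse fit in [38], and the ApEn routine is a direct O(N²) implementation of the procedure begun in Step 1, with m = 1 and r = 0.25 × SD as stated above; neither is the authors' exact code.

```python
import numpy as np

def poincare_features(rri):
    """SD1/SD2/SD12 from adjacent RR-interval pairs (ellipse-fit identities)."""
    rri = np.asarray(rri, dtype=float)
    x, y = rri[:-1], rri[1:]                      # (RR_i, RR_{i+1}) pairs
    sd1 = np.std((y - x) / np.sqrt(2), ddof=1)    # short-term (beat-to-beat) variability
    sd2 = np.std((y + x) / np.sqrt(2), ddof=1)    # long-term variability
    return {"SD1": sd1, "SD2": sd2, "SD12": sd1 / sd2}

def approximate_entropy(rr, m=1, r_factor=0.25):
    """ApEn of an RR series: phi(m) - phi(m+1), where phi averages the log
    fraction of template vectors within Chebyshev distance r (self-matches included)."""
    rr = np.asarray(rr, dtype=float)
    n, r = rr.size, r_factor * rr.std()

    def phi(mm):
        vecs = np.array([rr[i:i + mm] for i in range(n - mm + 1)])
        dist = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        c = (dist <= r).mean(axis=1)              # C_i^m(r), includes the self-match
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)
```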
Fig. 5 Autocorrelation of QRS complexes. (ACFcoef is 0.8982 and its lag time is 0.69 s; ACFfreq is the reciprocal of the lag time, 1.4492 Hz.)

TABLE 1
ECG FEATURE SETS GENERATED IN THIS STUDY

| Analysis | Signal | Features |
|---|---|---|
| Time-domain analysis | HR | BPM |
| | RR interval | Median-RRI, IQR-RRI, MAD-RRI, Diff-RRI, CV-RRI, Range, RMSSD, SDSD |
| Frequency-domain analysis | HRV | VLF, LF, HF, TP, LF/HF, LFnorm, HFnorm, pVLF, pLF, pHF, VLFfr, LFfr, HFfr |
| Nonlinear analysis | EDR | RSPrate, Coherence |
| | Poincaré plot | SD1, SD2, SD12 |
| | Nonlinear dynamics | ApEn, SampleEn |
| | Autocorrelation | ACFcoef, ACFfreq |
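For the autocorrelation features illustrated in Fig. 5, the following sketch computes ACFcoef as the height of the first dominant peak of the normalized autocorrelation of the QRS-complex sequence and ACFfreq as the reciprocal of its lag. The first-local-maximum peak-picking rule is an assumption, since the exact rule is not given in this excerpt.

```python
import numpy as np

def autocorrelation_features(qrs_signal, fs):
    """ACFcoef / ACFfreq sketch (cf. Fig. 5): height and reciprocal lag of the
    first dominant autocorrelation peak of the QRS-complex sequence."""
    x = np.asarray(qrs_signal, dtype=float)
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    acf = acf / acf[0]                                  # normalize so acf[0] == 1
    # First local maximum after lag 0 (assumed definition of the dominant peak).
    lag = next(i for i in range(1, acf.size - 1)
               if acf[i - 1] < acf[i] >= acf[i + 1])
    return {"ACFcoef": acf[lag], "ACFfreq": fs / lag}   # lag/fs seconds -> Hz
```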
computational complexity and increase the classification accuracy.
In this paper, the proposed feature selection method combines a selection criterion with a search strategy to select the appropriate features out of a total of 34 normalized features. This study develops an SFFS-KBCS-based feature selection algorithm that uses the SFFS method as the search strategy and the KBCS method as the selection criterion. We describe these methods as follows.

4.7.1 KBCS-based Selection Criterion
The KBCS method utilized in this paper was originally developed by Wang [41]. Let (x, y) ∈ (ℝ^d × Y) represent a sample, where ℝ^d is a d-dimensional feature space, Y denotes the set of class labels, and the size of Y is the number of classes c. This method projects the samples onto a kernel space 𝒦. Let φ(⋅) be a possibly nonlinear feature mapping from the feature space ℝ^d to the kernel space 𝒦 (φ: ℝ^d → 𝒦, x ↦ φ(x)); m_i^φ is the mean vector of the ith class in 𝒦, n_i is the number of samples in the ith class, m^φ is the mean vector of all classes in 𝒦, S_B^φ denotes the between-class scatter matrix in 𝒦, S_W^φ denotes the within-class scatter matrix in 𝒦, and S_T^φ denotes the total scatter matrix in 𝒦. K denotes a kernel matrix with {K}_ij = k(x_i, x_j), where k(x_i, x_j) is a kernel function, and K_{𝒜,ℬ} is a kernel matrix with the constraints x_i ∈ 𝒜 and x_j ∈ ℬ. The operator Sum(⋅) denotes the summation of all elements in a matrix, and trace(A) is the trace of a square matrix A. The relevant equations are:

\mathrm{trace}(\mathbf{S}_B^{\phi}) = \mathrm{trace}\Big(\sum_{i=1}^{c} n_i(\mathbf{m}_i^{\phi}-\mathbf{m}^{\phi})(\mathbf{m}_i^{\phi}-\mathbf{m}^{\phi})^T\Big) = \sum_{i=1}^{c}\mathrm{Sum}(\mathbf{K}_{D_i,D_i})/n_i - \mathrm{Sum}(\mathbf{K}_{D,D})/n,   (22)

\mathrm{trace}(\mathbf{S}_W^{\phi}) = \mathrm{trace}(\mathbf{K}_{D,D}) - \sum_{i=1}^{c}\mathrm{Sum}(\mathbf{K}_{D_i,D_i})/n_i,

where D denotes the whole training set, D_i the set of samples of the ith class, and n the total number of samples. The class separability in the kernel space can then be evaluated from these traces: the larger trace(S_B^φ) is relative to trace(S_W^φ), the better separated the classes are.
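The trace identities of Eq. (22) can be evaluated directly from a kernel matrix, as in the sketch below. The scalar score returned here (the ratio of the two traces) is one common way to scalarize class separability and is an assumption of this sketch, since the excerpt does not show the exact form of the criterion J_l.

```python
import numpy as np

def kernel_class_separability(K, labels):
    """Sketch of a KBCS-style criterion (after Wang [41]): trace(S_B) and
    trace(S_W) in the kernel space computed from the kernel matrix K via
    the Sum(.) identities of Eq. (22)."""
    labels = np.asarray(labels)
    n = K.shape[0]
    sum_within = 0.0
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        sum_within += K[np.ix_(idx, idx)].sum() / idx.size  # Sum(K_{Di,Di}) / n_i
    trace_sb = sum_within - K.sum() / n                     # between-class scatter trace
    trace_sw = np.trace(K) - sum_within                     # within-class scatter trace
    return trace_sb / trace_sw                              # assumed scalarization
```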
4.7.2 SFFS-based Search Strategy
The SFFS method [42] combines sequential forward selection (SFS) and sequential backward selection (SBS) to reduce the nesting effect. The SFS is a bottom-up feature selection method, whereas the SBS is a top-down feature selection method. Suppose that we want to choose a p-dimensional feature subset from the original feature set. SFS starts from an empty feature set and sequentially adds the one feature from the original feature set that results in the best selection criterion. In contrast, SBS starts from the original feature set and sequentially deletes the one feature whose removal results in the best selection criterion. The SFFS process consists of three steps: inclusion, conditional exclusion, and continuation of conditional exclusion [42].
First, suppose F_k = {f_i: 1 ≤ i ≤ k} is a selected feature set in which k features have already been selected from the original feature set Y = {y_i: 1 ≤ i ≤ n}, where n is the number of total features.
Step 1: [Inclusion] Select the feature f_{k+1} using the basic SFS method from the available set {Y − F_k} to form the feature set F_{k+1}; that is, f_{k+1} is the most significant feature in the available set {Y − F_k}, and it is added to F_k. Therefore, F_{k+1} = {F_k, f_{k+1}}.
Step 2: [Conditional exclusion] Find the least significant feature f_j in the feature set F_{k+1}. If f_{k+1} is the least significant feature in F_{k+1}, that is,

J_l(F_{k+1} - f_{k+1}) = J_l(F_k) \ge J_l(F_{k+1} - f_j), \quad j = 1, 2, \ldots, k,   (26)

where J_l(⋅) denotes the feature selection criterion function obtained from the KBCS, then the best feature combination of size k remains {F_{k+1} − f_{k+1}} = F_k; set k = k + 1 and return to Step 1. Otherwise, exclude f_j from the feature set F_{k+1} to form a new feature set of size k and continue the conditional exclusion.
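Read together with Eq. (26), the search loop can be sketched as below. Here `criterion` plays the role of J_l(·) (e.g., the KBCS score of a candidate subset), and the per-size bookkeeping of the best score seen so far is this sketch's way of implementing the conditional-exclusion test; both names are placeholders, not the paper's API.

```python
def sffs(features, criterion, target_size):
    """Sketch of the SFFS loop: inclusion of the most significant feature,
    then conditional exclusion of the least significant one (cf. Eq. (26))."""
    selected, best_j = [], {}            # best_j[k] = best criterion seen at size k
    while len(selected) < target_size:
        # Step 1 (inclusion): add the most significant remaining feature.
        best = max((f for f in features if f not in selected),
                   key=lambda f: criterion(selected + [f]))
        selected = selected + [best]
        best_j[len(selected)] = criterion(selected)
        # Step 2 (conditional exclusion): backtrack while removing the least
        # significant feature improves on the best subset of that size.
        while len(selected) > 2:
            worst = max(selected,
                        key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != worst]
            if criterion(reduced) <= best_j.get(len(reduced), float("-inf")):
                break                    # the just-added feature stays; resume inclusion
            selected = reduced
            best_j[len(selected)] = criterion(selected)
    return selected
```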
The GDA [43] is then applied to the selected features. In the kernel space 𝒦, the within-class scatter matrix is

\mathbf{S}_W^{\phi} = \sum_{i=1}^{c}\sum_{j=1}^{n_i}(\phi(\mathbf{f}_{ij}) - \mathbf{m}_i^{\phi})(\phi(\mathbf{f}_{ij}) - \mathbf{m}_i^{\phi})^T,   (30)

where f_ij is the jth sample of the ith class, m_i^φ is the mean vector of the ith class in the kernel space 𝒦, n_i is the number of samples in the ith class, and m^φ is the mean vector of all classes in 𝒦.
Step 2: Find the transformation matrix T ∈ 𝒦 by maximizing the ratio of the between-class scatter to the within-class scatter in the kernel space 𝒦. Because T lies in the span of the mapped training samples, the projection of a sample f_i onto T can be expanded in kernel form as

\sum_{l=1}^{c}\sum_{j=1}^{n_l}\alpha_{lj}\,k(\mathbf{f}_{lj}, \mathbf{f}_i),   (38)

where α_{lj} are the expansion coefficients.
Finally, we use the GDA-reduced features as the input features for the following classifier.

4.9 Classifier Construction
An LS-SVM is a binary classifier that relies on a nonlinear mapping of the training set to a higher-dimensional space, wherein the transformed data are well separated by a separating hyperplane. Assume a training set of N data points, {x_i, y_i}_{i=1}^N, where x_i = [x_{i1}, …, x_{iq}] ∈ ℝ^q is the ith input feature vector (q is the number of dimensions of the reduced features obtained by the GDA) and y_i ∈ {−1, 1} is the ith output label. The LS-SVMs perform classification using the following decision function:

y(\mathbf{x}) = \mathrm{sign}\Big(\sum_{i=1}^{N}\alpha_i y_i k(\mathbf{x}, \mathbf{x}_i) + b\Big),   (39)

where α_i are Lagrange multipliers (which can be either positive or negative) and b is a real constant. We utilize the Gaussian RBF kernel for the kernel function k in this study:

k(\mathbf{x}_i, \mathbf{x}_j) = \exp(-\|\mathbf{x}_i - \mathbf{x}_j\|^2 / 2\sigma^2),   (40)

where the Gaussian width σ is the kernel parameter (σ > 0).
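A compact sketch of LS-SVM training and of the decision function of Eq. (39) is given below. Following Suykens and Vandewalle [44], training reduces to solving a single linear system in (b, α); the regularization parameter gamma and kernel width sigma are illustrative values, not the paper's tuned settings.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma):
    """Gaussian RBF kernel of Eq. (40)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def train_ls_svm(X, y, sigma=1.0, gamma=10.0):
    """LS-SVM training sketch: solve [[0, y^T], [y, Omega + I/gamma]] [b; a] = [0; 1],
    where Omega_ij = y_i y_j k(x_i, x_j)."""
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:], A[1:, 0] = y, y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    b_alpha = np.linalg.solve(A, rhs)
    return b_alpha[0], b_alpha[1:]                     # bias b, multipliers alpha

def ls_svm_predict(X_train, y_train, b, alpha, X_test, sigma=1.0):
    """Decision function of Eq. (39)."""
    K = rbf_kernel(X_test, X_train, sigma)
    return np.sign(K @ (alpha * y_train) + b)
```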
The decision function is constructed from the constraints

y_i[\mathbf{w}^T\phi(\mathbf{x}_i) + b] \ge 1, \quad i = 1, \ldots, N,   (41)

where φ(⋅) is a nonlinear mapping function that maps the input space onto a higher-dimensional space. However, φ(⋅) is never explicitly constructed, and a perfectly separating hyperplane may not exist even in the higher-dimensional space. Therefore, slack variables ξ = (ξ_1, …, ξ_N) are introduced to tolerate misclassifications:

y_i[\mathbf{w}^T\phi(\mathbf{x}_i) + b] \ge 1 - \xi_i, \quad i = 1, \ldots, N.

Because an LS-SVM is a binary classifier, a one-against-one voting strategy is utilized when m classes need to be classified. Each pairwise LS-SVM classifier outputs one of its two classes, and that class receives one vote; the final classification result is the class with the most votes. If two classes have an identical number of votes, we simply select the one with the smaller label; that is, if Joy and Peacefulness, which are labeled "1" and "4", respectively, have the same number of votes, we predict Joy. In the one-against-one method for multiclass support vector machines, a sample classified into two classes with the same number of votes is assigned to one class randomly [46] or to the class with the smaller index [45]; our selection follows the method proposed in [45]. In this study, the output of the classifier is the label of the two types of valence (negative and positive valence are labeled "1" and "2"), the two types of arousal (low and high arousal are labeled "1" and "2"), or the four types of emotions (Joy, Tension, Sadness, and Peacefulness are labeled "1", "2", "3", and "4", respectively).
space. Therefore, slack variables 𝜉 = (𝜉1 , … , 𝜉𝑁 ) introduced method and four other feature reduction methods, such as
as follows are utilized to solve this misclassification prob- SFFS-KBCS+GDA, SFFS-KBCS+PCA, SFFS-KBCS+LDA,
lem as follows: SFFS-KBCS+PCA+LDA, and SFFS-KBCS+PCA+GDA, once
the optimal dimensions of each of the feature reduction
yi w ( x i ) b 1 - i , i = 1, ..., N
T
SVM
KBCS+PCA+LDA+LS-SVM, and SFFS- 90
SFFS-K
KBCS+PCA+GDA+LS-SVM were 78.73%, 71.14%, 82.78%, 80
SFFS-K
SFFS-K
69.87%, and 79.49%, respectively. The classification perfor- SFFS-K
SFFS-K
mance comparisons of the proposed feature reduction
70
CCR(%)
LOSO, and LOO cross-validation strategies are summa- 50
rized in Table 2. Obviously, these results demonstrate that
the proposed SFFS-KBCS+GDA+LS-SVM scheme outper- 40
To obtain an optimal feature subset selected by the SFFS- Fig. 7 CCRs versus number of features selected by SFFS-KBCS with
KBCS-based feature selection method for the PCA, LDA, LS-SVM classifier between different feature reduction methods by
GDA, PCA+LDA, and PCA+GDA feature reduction meth- LOO cross-validation in the positive/negative valence classification
task. (Green: PCA. Orange: LDA. Red: GDA. Brown: PCA+LDA. Pink:
ods from the 34 features, we varied the number of features PCA+GDA.)
from 1 to 34. A comparison of the CCRs using different SVM
75
numbers of features through the different feature reduc- SFFS-KB
SFFS-KB
tion methods with the LS-SVM classifier verified by LOO 65
SFFS-KB
SFFS-KB
cross-validation is shown in Fig. 8; the best CCR was SFFS-KB
72.91% when 18 dimensions were selected by the SFFS- 55
KBCS for the GDA method. The overall CCRs of SFFS-
CCR(%)
45
and PCA+GDA feature reduction methods from the 34 fea-
40
tures. The LS-SVM classifiers between the aforementioned
35
five feature reduction methods were also verified by LOO
cross-validation for different numbers of features. A com- 30
TABLE 3
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED FEATURE REDUCTION METHODS WITH LS-SVM CLASSIFIERS BY CROSS-VALIDATION IN HIGH/LOW AROUSAL CLASSIFICATION TASK

TABLE 4
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED FEATURE REDUCTION METHODS WITH LS-SVM CLASSIFIERS BY CROSS-VALIDATION IN FOUR EMOTIONS CLASSIFICATION TASK
TABLE 6
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED CLASSIFICATION SCHEME WITH SOME EXISTING SCHEMES FOR POSITIVE/NEGATIVE VALENCE CLASSIFICATION TASK

| Author | Signals | No. of Subjects | Emotions | Induction Method | Classification Scheme | CCR |
TABLE 7
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED CLASSIFICATION SCHEME WITH SOME EXISTING SCHEMES FOR HIGH/LOW AROUSAL CLASSIFICATION TASK

| Author | Signals | No. of Subjects | Emotions | Induction Method | Classification Scheme | CCR |
TABLE 8
CLASSIFICATION PERFORMANCE COMPARISONS OF PROPOSED CLASSIFICATION SCHEME WITH SOME EXISTING SCHEMES FOR MULTI-EMOTION CLASSIFICATION TASK

| Author | Signals | No. of Subjects | Emotions | Induction Method | Classification Scheme | CCR |
|---|---|---|---|---|---|---|
| Proposed method | ECG | 61 | Joy, Tension, Sadness, Peacefulness | Music | SFFS-KBCS+GDA+LS-SVM | 61.52% (4 emotions) |
| Kim et al. [2] | ECG, ST, SC | 50 | Sad, Anger, Stress, Surprise | Multimodal (audio, visual, and cognitive stimuli) | SVM | 78.40% (3 emotions), 61.80% (4 emotions) |
| Rainville et al. [6] | ECG, RSP | 43 | Fear, Anger, Sadness, Happiness | Recall of personal emotional episodes | PCA+Heuristic decision tree | 65.30% (4 emotions) |
| Rigas et al. [7] | EMG, ECG, GSR, RSP | 9 | Happiness, Disgust, Fear | Picture viewing (IAPS) | K-nearest neighbors | 62.70% (3 emotions) |
| Kim and André [4] | ECG, SC, EMG, RSP | 3 | Joy, Anger, Sad, Pleasure | Music | pLDA+EMDC | 69.70% (4 emotions) |
| Wen et al. [10] | OXY, GSR, ECG | 101 | Amusement, Anger, Grief, Fear, Baseline | Video clips | Random forests classifier | 74.00% (5 emotions) |
| Gu et al. [1] | ECG, BVP, GSR, EMG, RSP | 28 | Positive & high arousal; Negative & high arousal; Positive & low arousal; Negative & low arousal | Picture viewing (IAPS) | K-nearest neighbors | 50.30% (4 emotions) |
respectively, using only ECG signals recorded during music listening. In this section, we compare the existing classification schemes with our proposed scheme for the positive/negative valence, high/low arousal, and multi-emotion classification tasks. The performance comparisons of our proposed scheme and the existing schemes for the valence, arousal, and multi-emotion classification tasks are summarized in Tables 6, 7, and 8, respectively.
We used the proposed SFFS-KBCS+GDA+LS-SVM scheme to classify positive/neutral/negative valence using the ECG data collected from [8]; as shown in Table 6, the classification performance of the proposed SFFS-KBCS+GDA+LS-SVM scheme and the ANOVA+SVM scheme [8] is similar. In addition, the CCR of the proposed SFFS-KBCS+GDA+LS-SVM scheme is approximately 20% higher than that of the LDA+Gaussian naïve Bayes scheme with BVP, EOG, EMG, GSR, ST, and RSP biosignals [5]. Thus, the proposed SFFS-KBCS+GDA+LS-SVM scheme is appropriate for the positive/negative valence emotion classification task using only ECG signals.
According to Table 7, the CCRs obtained using the proposed SFFS-KBCS+GDA+LS-SVM scheme for the high/low arousal classification task were 72.91% and 49.20% when the participants' emotions were elicited by music and video clips, respectively. In addition, the overall CCR of the proposed SFFS-KBCS+GDA+LS-SVM scheme
was better than that of the ANOVA+SVM scheme [8] and the LDA+Gaussian naïve Bayes scheme [5] by more than 3.00% and 15.91%, respectively. Therefore, the results indicate that the proposed SFFS-KBCS+GDA+LS-SVM scheme is the best combination for the high/low arousal emotion classification task when using only ECG signals.
The performance comparisons of our proposed scheme and six existing schemes for the multi-emotion classification task are summarized in Table 8. The results clearly show that the performance of the proposed SFFS-KBCS+GDA+LS-SVM scheme using only ECG signals is similar to that of other schemes using multiple biosignals. This validates that the proposed scheme can achieve performance comparable to that of existing schemes even when using fewer biosignals. Furthermore, we find that the CCRs of the proposed SFFS-KBCS+GDA+LS-SVM scheme deteriorated from 82.78% and 72.91% to 61.52% when the number of classified emotion categories was increased from two to four. In other words, the number of emotion categories is an influential factor that degrades the performance of the feature reduction method and the LS-SVM classifier when the proposed scheme uses only ECG signals.

7 CONCLUSION
In this paper, we presented an automatic ECG-based emotion recognition algorithm consisting of the SFFS-KBCS-based feature selection algorithm, the GDA feature reduction method, and LS-SVM classifiers, using only ECG signals to discriminate positive/negative valence, high/low arousal, and four types of emotions (joy, tension, sadness, and peacefulness) elicited by listening to music. A total of 34 features were extracted from the time-domain, frequency-domain, and nonlinear analyses of the ECG signals to provide discriminative information for the emotion recognition tasks. Subsequently, the degrees of physiological change in the aforementioned ECG features between the baseline stage and the music listening stage were obtained for the proposed SFFS-KBCS+GDA+LS-SVM classification scheme. Overall CCRs of 82.78%, 72.91%, and 61.52% were obtained with the LOO cross-validation strategy for the valence, arousal, and four-emotion classification tasks, respectively, on a multi-subject database consisting of 395 strongly elicited affective samples from 61 participants. These experimental results validate the effectiveness of the proposed SFFS-KBCS+GDA+LS-SVM scheme. Moreover, the CCRs are higher than or similar to those reported in the literature reviewed in this paper when considering the different induction methods, emotion types, and numbers of subjects. In conclusion, we believe that the proposed automatic ECG-based emotion recognition algorithm can be considered effective for ECG-based emotion recognition tasks.

ACKNOWLEDGMENTS
This work was supported by the Ministry of Science and Technology of the Republic of China, Taiwan, under Grants No. MOST 106-3011-E-006-002 and MOST 106-2221-E-035-004.

REFERENCES
[1] Y. Gu, S. L. Tan, K. J. Wong, M. H. R. Ho, and L. Qu, "A biometric signature based system for improved emotion recognition using physiological responses from multiple subjects," in Proc. 8th IEEE Int'l Conf. Industrial Informatics, pp. 61-66, 2010.
[2] K. H. Kim, S. W. Bang, and S. R. Kim, "Emotion recognition system using short-term monitoring of physiological signals," Medical & Biological Engineering & Computing, vol. 42, pp. 419-427, 2004.
[3] D. Kulić and E. A. Croft, "Affective state estimation for human-robot interaction," IEEE Trans. Robotics, vol. 23, no. 5, pp. 991-1000, 2007.
[4] J. Kim and E. André, "Emotion recognition based on physiological changes in music listening," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 12, pp. 2067-2083, 2008.
[5] S. Koelstra, C. Mühl, M. Soleymani, J. S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, "DEAP: A database for emotion analysis using physiological signals," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 18-31, 2012.
[6] P. Rainville, A. Bechara, N. Naqvi, and A. R. Damasio, "Basic emotions are associated with distinct patterns of cardiorespiratory activity," International Journal of Psychophysiology, vol. 61, pp. 5-18, 2006.
[7] G. Rigas, C. D. Katsis, G. Ganiatsas, and D. I. Fotiadis, "A user independent, biosignal based, emotion recognition method," in Proc. 11th Int'l Conf. User Modeling, pp. 314-318, 2007.
[8] M. Soleymani, J. Lichtenauer, T. Pun, and M. Pantic, "A multimodal database for affect recognition and implicit tagging," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 42-55, 2012.
[9] A. Kleinsmith and N. Bianchi-Berthouze, "Affective body expression perception and recognition: A survey," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 15-33, 2013.
[10] W. Wen, G. Liu, N. Cheng, J. Wei, P. Shangguan, and W. Huang, "Emotion recognition based on multi-variant correlation of physiological signals," IEEE Trans. Affective Computing, vol. 5, no. 2, pp. 126-140, 2014.
[11] K. Wac and C. Tsiourti, "Ambulatory assessment of affect: Survey of sensor systems for monitoring of autonomic nervous systems activation in emotion," IEEE Trans. Affective Computing, vol. 5, no. 3, pp. 251-272, 2014.
[12] R. Jenke, A. Peer, and M. Buss, "Feature extraction and selection for emotion recognition from EEG," IEEE Trans. Affective Computing, vol. 5, no. 3, pp. 327-339, 2014.
[13] M. Kusserow, O. Amft, and G. Tröster, "Modeling arousal phases in daily living using wearable sensors," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 93-105, 2013.
[14] F. Agrafioti, D. Hatzinakos, and A. K. Anderson, "ECG pattern analysis for emotion detection," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 102-115, 2012.
[15] P. Ekman, "An argument for basic emotions," Cognition and Emotion, vol. 6, pp. 169-200, 1992.
[16] J. A. Russell, "A circumplex model of affect," Journal of Personality and Social Psychology, vol. 39, no. 6, pp. 1161-1178, 1980.
[17] J. Posner, J. A. Russell, and B. S. Peterson, "The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology," Development and Psychopathology, vol. 17, no. 3, pp. 715-734, 2005.
[18] M. D. van der Zwaag, J. H. Janssen, and J. H. D. M. Westerink, "Directing physiology and mood through music: Validation of an affective music player," IEEE Trans. Affective Computing, vol. 4, no. 1, pp. 57-68, 2013.
[19] S. D. Kreibig, "Autonomic nervous system activity in emotion: A review," Biological Psychology, vol. 84, pp. 394-421, 2010.
[20] G. Valenza, A. Lanatà, and E. P. Scilingo, "The role of nonlinear dynamics in affective valence and arousal recognition," IEEE Trans. Affective Computing, vol. 3, no. 2, pp. 237-249, 2012.
[21] M. Nardelli, G. Valenza, A. Greco, A. Lanata, and E. P. Scilingo, "Recognizing emotions induced by affective sounds through heart rate variability," IEEE Trans. Affective Computing, vol. 6, no. 4, pp. 385-394, 2015.
[22] M. Orini, R. Bailón, R. Enk, S. Koelsch, L. Mainardi, and P. Laguna, "A method for continuously assessing the autonomic response to music-induced emotions through HRV analysis," Medical & Biological Engineering & Computing, vol. 48, pp. 423-433, 2010.
[23] C. L. Krumhansl, "An exploratory study of musical emotions and psychophysiology," Canadian Journal of Experimental Psychology, vol. 51, pp. 336-352, 1997.
[24] A. L. Roque, V. E. Valenti, H. L. Guida, M. F. Campos, A. Knap, L. C. M. Vanderlei, L. L. Ferreira, C. Ferreira, and L. C. de Abreu, "The effects of auditory stimulation with music on heart rate variability in healthy women," Clinics, vol. 68, no. 7, pp. 960-967, 2013.
[25] M. Naji, M. Firoozabadi, and P. Azadfallah, "Classification of music-induced emotions based on information fusion of forehead biosignals and electrocardiogram," Cognitive Computation, vol. 6, no. 2, pp. 241-252, 2014.
[26] F. M. Vanderlei, L. C. de Abreu, D. M. Garner, and V. E. Valenti, "Symbolic analysis of heart rate variability during exposure to musical auditory stimulation," Alternative Therapies in Health and Medicine, vol. 22, no. 2, pp. 24-31, 2016.
[27] A. Gabrielsson and P. N. Juslin, "Emotion expression in music," Handbook of Affective Sciences, R. J. Davidson, K. R. Scherer, and H. H. Goldsmith, eds., pp. 503-534, Oxford Univ. Press, 2003.
[28] J. Pan and W. J. Tompkins, "A real-time QRS detection algorithm," IEEE Trans. Biomedical Engineering, vol. 32, no. 3, pp. 230-236, 1985.
[29] M. Mneimneh, E. Yaz, M. Johnson, and R. Povinelli, "An adaptive Kalman filter for removing baseline wandering in ECG signals," in Proc. Computers in Cardiology, pp. 253-256, 2006.
[30] P. de Chazal, C. Heneghan, E. Sheridan, R. Reilly, P. Nolan, and M. O'Malley, "Automated processing of the single-lead electrocardiogram for the detection of obstructive sleep apnea," IEEE Trans. Biomedical Engineering, vol. 50, no. 6, pp. 686-696, 2003.
[31] J. S. Wang, W. C. Chiang, Y. L. Hsu, and Y. T. C. Yang, "ECG arrhythmia classification using a probabilistic neural network with a feature reduction method," Neurocomputing, vol. 116, pp. 38-45, 2013.
[32] U. R. Acharya, K. P. Joseph, N. Kannathal, C. M. Lim, and J. S. Suri, "Heart rate variability: A review," Medical and Biological Engineering and Computing, vol. 44, pp. 1031-1051, 2006.
[33] Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology, "Heart rate variability: Standards of measurement, physiological interpretation and clinical use," European Heart Journal, vol. 17, pp. 354-381, 1996.
[34] J. P. Niskanen, M. P. Tarvainen, P. O. Ranta-aho, and P. A. Karjalainen, "Software for advanced HRV analysis," Computer Methods and Programs in Biomedicine, vol. 76, pp. 73-81, 2004.
[35] S. B. Park, Y. S. Noh, S. J. Park, and H. R. Yoon, "An improved algorithm for respiration signal extraction from electrocardiogram measured by conductive textile electrodes using instantaneous frequency estimation," Medical & Biological Engineering & Computing, vol. 46, pp. 147-158, 2008.
[36] R. Bailón, L. Sörnmo, and P. Laguna, "A robust method for ECG-based estimation of the respiratory frequency during stress testing," IEEE Trans. Biomedical Engineering, vol. 53, no. 7, pp. 1273-1285, 2006.
[37] W. A. Tiller, R. McCraty, and M. Atkinson, "Cardiac coherence: A new, noninvasive measure of autonomic nervous system order," Alternative Therapies, vol. 2, no. 1, pp. 52-65, 1996.
[38] M. P. Tulppo, T. H. Mäkikallio, T. E. S. Takala, T. Seppänen, and H. V. Huikuri, "Quantitative beat-to-beat analysis of heart rate dynamics during exercise," American Journal of Physiology-Heart and Circulatory Physiology, vol. 271, pp. 244-252, 1996.
[39] G. D. Vito, S. D. R. Galloway, M. A. Nimmo, P. Maas, and J. J. V. McMurray, "Effects of central sympathetic inhibition on heart rate variability during steady-state exercise in healthy humans," Clinical Physiology and Functional Imaging, vol. 22, pp. 32-38, 2002.
[40] J. McNames and M. Aboy, "Reliability and accuracy of heart rate variability metrics versus ECG segment duration," Medical & Biological Engineering & Computing, vol. 44, pp. 747-756, 2006.
[41] L. Wang, "Feature selection with kernel class separability," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1534-1546, 2008.
[42] P. Pudil, J. Novovičová, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, pp. 1119-1125, 1994.
[43] G. Baudat and F. Anouar, "Generalized discriminant analysis using a kernel approach," Neural Computation, vol. 12, no. 10, pp. 2385-2404, 2000.
[44] J. A. K. Suykens and J. Vandewalle, "Least squares support vector machine classifiers," Neural Processing Letters, vol. 9, pp. 293-300, 1999.
[45] C. W. Hsu and C. J. Lin, "A comparison of methods for multiclass support vector machines," IEEE Trans. Neural Networks, vol. 13, no. 2, pp. 415-425, 2002.
[46] B. Liu, Z. Hao, and E. C. C. Tsang, "Nesting one-against-one algorithm based on SVMs for pattern classification," IEEE Trans. Neural Networks, vol. 19, no. 12, pp. 2044-2052, 2008.

Yu-Liang Hsu (M'17) received the B.S. degree in Automatic Control Engineering from Feng Chia University, Taichung, Taiwan, in 2004, and the M.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Tainan, Taiwan, in 2007 and 2011, respectively. He is currently an Assistant Professor in the Department of Automatic Control Engineering, Feng Chia University. His research interests include computational intelligence, biomedical engineering, nonlinear system identification, and wearable intelligent technology.