
Chapter 2

Literature Survey

This chapter presents an overview of contemporary research related to emotion recognition: the datasets available, the recording equipment used with the 10-20 probe system, and the research carried out over the last decade. The description extends to feature extraction methods, signal processing and classification algorithms. The chapter also surveys the methodologies used by researchers and their corresponding prediction performances.

2.1 Introduction

Emotion is a very complex subject, and so is its classification. Different types of emotion have been proposed by psychological researchers, but data scientists have simplified these and limited them to a few classes when classifying emotions from EEG signals. Some researchers have classified emotion into three classes, such as anger, surprise and others, where "others" denotes any emotion other than the first two. The analysis of EEG signals concerning emotion has been based on the temporal and spatial domains. The EEG signal generally consists of five frequency bands: delta, theta, alpha, beta and gamma. The source of EEG signals, their characteristics and their use in emotion recognition are discussed in this chapter, along with the acquisition of EEG signals and their pre-processing. EEG signals are non-controlled signals in the sense that human beings are not capable of manipulating them voluntarily. A proper objective is set before any experiment as to which emotion is to be elicited in the subject. Different audio/video clips have been designed by the affective computing community to evoke different emotions; the same audio/video clip may not produce the same emotion in subjects from different regions of the world, so the clips and their durations are chosen according to the target emotion. The 10-20 system and the different probes used by different researchers are discussed in this chapter. Noise-free EEG signals are always preferred for efficient emotion recognition, so the pre-processing of EEG signals is an important topic of discussion. Feature selection and the statistical tools for feature extraction are also important for efficient emotion recognition. There is no direct relation between EEG signals and a manual score of emotion, since EEG signals are involuntary whereas a manual score is voluntary. Careful selection and segmentation of the EEG signals are therefore often necessary to correlate them with the manual score of emotion. Sometimes special signal processing tools are used to enhance the relation (linear or nonlinear) between EEG signals and manual scores.

In the literature, different statistical classifiers have been explored for the efficient classification of emotion. SVM has been a prominent classifier for emotion recognition; however, other classifiers have also been used. It is observed from the related studies that features are extracted in both the temporal and the frequency domain. Among the frequency-domain features, the Fourier transform and the short-time Fourier transform have been widely used. Time-domain features are mainly extracted using Empirical Mode Decomposition (EMD) and the wavelet transform in its different forms. Among the classifiers, SVM (Wang et al., 2011), (Hosseini and Naghibi-Sistani, 2011), (Duan et al., 2013) has been dominant in both two-class and multiclass settings. Other classifiers such as K-Nearest Neighbour (KNN) (Mohammadi et al., 2017) and Naïve Bayes (NB) (Huang et al., 2012) have also been used in several emotion classification studies. Though several researchers have explored different techniques for the classification task, in each case the selection of features is crucial.

Alarcao and Fonseca (2017) reviewed a significant number of popular emotion recognition papers published between 2009 and 2016 and found that as many as 42 distinct feature selection approaches had been explored. They further revealed that different forms of the Fourier transform, such as the Short-Time Fourier Transform (STFT) and the Discrete Fourier Transform (DFT), together with the Wavelet Transform, have been the most popular feature estimation techniques, besides the statistical Power Spectral Density (PSD).

Of late, different variants of neural networks with different activation functions have been used to enhance emotion classification. Information enhancement techniques have also been reported for the efficient classification of emotions. Extensive research is being done on emotion recognition, yet uniformity in results has not been achieved; there is not even uniformity in the reporting of research, and prediction accuracies are reported with different metrics. Brouwer et al. (2015) have recommended six ways to bring uniformity to research on emotion from physiological signals. These recommendations are further elaborated by Alarcao and Fonseca (2017), and the six major approaches are summarized in Figure 2.1.

Figure 2.1: Research recommendations for physiological signals vs. emotion (Brouwer et al., 2015)

2.2 Electrical Activity of the Brain

The somatosensory system is the part of the human sensory system which evokes sensation in the body due to touch, environmental temperature, pressure, fear or pleasure. This sensory information is transported to the brain through the spinal cord and other nervous pathways for interpretation and reaction. Event-Related Potentials (ERPs) (Sur and Sinha, 2009) are electrical activities of the brain that appear during special events or stimuli. The early waves of ERPs last for about 100 milliseconds after the appearance of the stimulus and are termed exogenous, since they depend on the parameters of the stimulus; the later part of the wave is termed endogenous and depends on the subject's evaluation of the stimulus. Event-Related Desynchronization (ERD) (Pfurtscheller, 1991) refers to the attenuation of alpha waves for a short duration, while Event-Related Synchronization (ERS) is the enhancement of alpha waves for a short duration. During visual stimuli, ERD is found in the occipital zone and ERS over the central part of the brain close to C3 and C4.

The evoked potential is electrical activity resulting from different stimuli, recorded in the form of EEG signals. Evoked potentials are divided into four categories: (1) visual evoked potential, (2) auditory evoked potential, (3) motor evoked potential and (4) somatosensory evoked potential. The amygdala is the main part of the emotional system of the human body. It gives instantaneous responses to stimuli in the form of emotion and identifies facial expressions as belonging to friend or foe, so the affective system is closely associated with the amygdala. The thalamus is another part of the limbic system; it processes emotions such as happiness, pleasure, fear, sadness and disgust, and it also relays all sensory inputs except olfactory information. Another important part of the cognitive process is the Ventral Tegmental Area (VTA). It is located in the central part of the brain, is responsible for impulsive actions, and also processes information from the amygdala.

2.3 Probe Location of 10-20 System

The 10-20 system is an internationally standardised electrode placement system for the skull of the human brain. The name 10-20 indicates that adjacent probes are placed at distances of 10% or 20% of the nasion-to-inion length. In this system the whole skull is divided into four regions: frontal, temporal, parietal and occipital. A line from nasion to inion divides the skull into two halves: electrode labels on the left side carry an odd-number suffix, those on the right side an even-number suffix, and all probes on the nasion-inion midline are marked with the suffix z (zero).

with a suffix zero or z. There are many recording machines with skull caps available

on market. Biosem active two (Khalili and Moradi, 2009) is a 280 channels DC

amplifier with 24-bit resolution coupled with four different sampling frequencies.

Some of the EEG recording equipment and channel counts, with their sampling frequencies, are presented in Table 2.1. Li and Lu (2009) used an Ag/AgCl 62-channel EEG cap for recording the EEG signals of ten subjects at a sampling rate of 1 kHz; they then explored features with statistical tools such as Common Spatial Patterns (CSP). Lin et al. (2009) harvested EEG signals using a Neuroscan module with 32 channels. Bhatti et al. (2016) recorded EEG signals using a Neurosky headset while subjects listened to music; they recorded four different emotions, namely happiness, sadness, love and anger, in response to audio music.

Table 2.1: Some of the EEG recording equipment and channels with the sampling frequency

Author                      Equipment            Frequency   Channels   10-20 probes
Li and Lu (2009)            EEG cap              1 kHz       62         NA
Khalili and Moradi (2009)   BioSemi ActiveTwo    2048 Hz     54         NA
Lin et al. (2009)           Neuroscan EEG        500 Hz      12         FP1-FP2, F7-F8, F3-F4, FT7-FT8, FC3-FC4, T7-T8, P7-P8, C3-C4, TP7-TP8, CP3-CP4, P3-P4, O1-O2
Bhatti et al. (2016)        Neurosky             512 Hz      01         Fp1

2.4 Stimulus of Emotion

Emotion is elicited, or invoked, in the brain by some stimulus, and these stimuli can be audio, visual or tactile. Audio and video clips have long been used to invoke emotion in the human mind. The reliability of the data depends on the stimulus which elicits the emotion; other parameters affecting reliability are the number of subjects, their gender, their median age and the duration of the signal recording. The International Affective Picture System (IAPS) is one of the databases designed to elicit different emotions in the human mind.

Yazdani et al. (2009) used video clips from YouTube as stimuli to evoke the six basic emotions defined by Paul Ekman: joy, sadness, surprise, disgust, fear and anger. The mean, minimum and maximum durations of the video clips were 58 seconds, 15 seconds and 161 seconds respectively.

Soleymani et al. (2015) studied facial expressions and EEG signals and reported successful extraction of emotion from facial activities. EEG signals are an involuntary output of human activity and cannot be modified by the subject, so the information content of the EEG signal is a more genuine manifestation of human physiological states, including emotions. Facial expressions, by contrast, can differ from the emotion actually experienced by the subject, who may modify the expression to a large extent.

Kroupi et al. (2014) studied the effects of the olfactory bulb in the elicitation of emotion based on audio, video and odour, using the Wasserstein distance metric to estimate the power difference between trials and baselines as input to the classifier. However, no hypothesis was proposed linking this measure to the class or intensity of emotion. Chanel et al. (2011) worked on emotion while subjects played games at various difficulty levels; in this case the features were the theta (θ), alpha (α) and beta (β) bands of the EEG signal together with some other physiological signals. The stimuli, durations, numbers of subjects and emotions recorded in some studies are listed in Table 2.2.

Table 2.2: Some of the list of stimulus, duration, number of subjects and emotion recorded

Author                 Stimulus   Duration   Subjects (M\F)   Emotions
Li and Lu (2009)       IAPS       2.5 sec    5 (0\5)          Calm, positively excited & negatively excited
Lin et al. (2009)      Music      30 sec     26 (-\-)         Joy, anger, sadness & pleasure; valence & arousal
Nie et al. (2011)      Video      4 min      6 (3\3)          Positive & negative
Lan et al. (2016)      IADS       76 sec     5 (1\4)          Pleasant, happy, angry & frightened
Kroupi et al. (2014)   Odours     8 sec      25 (9\16)        Pleasantness
Pan et al. (2016)      Images     8 sec      6 (-\-)          Happiness & sadness

It is clear from this section that wide-ranging stimuli have been applied to evoke emotion, from YouTube videos to music, images and odours, and that facial expressions have been studied besides EEG signals. It is therefore difficult to standardise the stimulus, and so also the classification of emotion.


2.5 Dataset Available for Research

Several datasets are publicly available for research, as mentioned in Table 2.3. These datasets vary widely in the number of subjects, which is generally small and differs from dataset to dataset. The stimuli are also more or less different, and the recording length, sampling rate and pre-processing vary as well. Two prominent datasets, SEED and DEAP, are discussed in greater detail in this section.

SEED dataset: The SEED dataset was recorded from fifteen Chinese nationals with an average age of 23.27 years (standard deviation ±2.37). The group consists of seven males and eight females. The SEED dataset was recorded using fifteen Chinese film clips, each of about four minutes' duration. Before the actual exposure to the film clip, the subject is given five minutes for preparation; after the four-minute exposure, the subject is given 45 seconds for self-assessment and thereafter 5 seconds of rest.

The data is stored in two folders, named 'Preprocessed EEG' and 'Extracted Features'. Pre-processed data is segmented and downsampled to 200 Hz before being stored in the 'Preprocessed EEG' folder. Data is recorded three times for each subject, with a gap of one week between two recordings, giving 45 Matlab files, one for each experiment. The 'Extracted Features' folder stores different features: differential entropy (DE), differential asymmetry (DASM) and rational asymmetry (RASM). The feature signals are further smoothed using moving average filter and linear dynamic system (LDS) approaches.

Table 2.3: Some of the emotional databases publicly available for research

Author                    Dataset           Channels   Stimulus            Subjects
Zheng and Lu (2017)       SEED              62         Film clips          15
Koelstra et al. (2011)    DEAP              40         Music videos        32
Martin et al. (2006)      eNTERFACE         54         IAPS                05
Li et al. (2020)          SEED IV           62         72 film clips       15
Soleymani et al. (2011)   MAHNOB-HCI        32         Movies & pictures   30
Cattan et al. (2018)      EEG Alpha Waves   16         Eyes open\close     20
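For a band-passed EEG segment that is approximately Gaussian, differential entropy reduces to a simple function of the signal variance, DE = ½ ln(2πeσ²). The sketch below is a rough illustration of this idea together with moving-average smoothing, not the SEED authors' exact pipeline (the LDS smoothing variant is not reproduced, and the segment sizes are assumptions):

    import numpy as np

    def differential_entropy(segment):
        # DE of a (channels, samples) band-passed segment, assuming Gaussianity
        var = np.var(segment, axis=1)
        return 0.5 * np.log(2 * np.pi * np.e * var)

    def moving_average(features, k=5):
        # smooth a (windows, channels) feature series along the time axis
        kernel = np.ones(k) / k
        return np.apply_along_axis(
            lambda col: np.convolve(col, kernel, mode='same'), 0, features)

    # toy usage: 62 channels at 200 Hz, one DE value per 1-second window
    eeg = np.random.randn(62, 200 * 60)
    windows = eeg.reshape(62, 60, 200).transpose(1, 0, 2)   # (windows, channels, samples)
    de = np.stack([differential_entropy(w) for w in windows])
    de_smooth = moving_average(de)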

DEAP dataset: DEAP is a multimodal dataset containing data from 32 subjects. For each subject, 32 EEG channels and 8 peripheral channels were recorded against 40 music videos. Some of the emotional databases available in the public domain are presented in Table 2.3.
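The preprocessed DEAP files are distributed as pickled Python dictionaries, one per subject, holding a 'data' array of shape 40 trials × 40 channels × 8064 samples and a 'labels' array of shape 40 × 4 (valence, arousal, dominance and liking on the 1-9 scale). A minimal loading sketch, with the file path as a placeholder:

    import pickle

    # path is a placeholder; the preprocessed files are named s01.dat .. s32.dat
    with open('deap/s01.dat', 'rb') as f:
        subject = pickle.load(f, encoding='latin1')   # Python-2 pickle needs latin1

    data, labels = subject['data'], subject['labels']
    print(data.shape, labels.shape)        # (40, 40, 8064), (40, 4)

    eeg = data[:, :32, :]                  # first 32 channels are EEG, the rest peripheral
    y = (labels[:, 0] > 5).astype(int)     # binarise valence at the 1-9 midpoint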

It is observed from the above presentation that the publicly available datasets are few in number and are not uniform in terms of the videos shown, their duration, or the number of channels used to record the EEG signals. The emotional marking is also done on different scales: for example, the DEAP dataset applied a continuous 1 to 9 scale to express emotion levels, whereas the SEED dataset used plus one for positive emotion, zero for neutral emotion and minus one for negative emotion. Further, the subjects' average ages and cultural identities are not the same, and cultural differences have a great influence on the perception of emotion. Finally, the prediction result varies across these datasets even when prediction is done with the same model: for example, Gupta et al. (2018) reported prediction results of 59 percent and 83 percent while working with the DEAP and SEED datasets respectively.

2.6 Feature Selection

Electroencephalography is the process of harvesting physiological signals from the scalp. The signal amplitude is in the range of microvolts to millivolts, and it is collected through electrical probes placed on the different zones of the scalp defined by the 10-20 system. The EEG signal occupies the frequency range from about 0.1 Hz to around 100 Hz. It is of great importance in diagnosing brain disorders such as epilepsy. The EEG signal consists of five widely accepted spectral bands: delta (0 to 4 Hz), theta (4 to 8 Hz), alpha (8 to 13 Hz), beta (13 to 30 Hz) and gamma (> 30 Hz). Many researchers have used one or more spectral bands as features for the prediction of emotion, as illustrated below.
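As a concrete illustration of band-power features, the following sketch estimates the power spectral density of one channel with Welch's method and integrates it over the band limits given above (the sampling rate and the 45 Hz gamma cap are assumptions):

    import numpy as np
    from scipy.signal import welch

    BANDS = {'delta': (0.5, 4), 'theta': (4, 8), 'alpha': (8, 13),
             'beta': (13, 30), 'gamma': (30, 45)}   # delta edge kept above DC

    def band_powers(x, fs):
        # absolute power per EEG band for a 1-D signal x sampled at fs Hz
        freqs, psd = welch(x, fs=fs, nperseg=2 * fs)
        powers = {}
        for name, (lo, hi) in BANDS.items():
            mask = (freqs >= lo) & (freqs < hi)
            powers[name] = np.trapz(psd[mask], freqs[mask])
        return powers

    fs = 128                              # e.g. DEAP's preprocessed sampling rate
    x = np.random.randn(10 * fs)          # placeholder 10-second channel
    print(band_powers(x, fs))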

The first objective of a classification problem is to select a proper feature and a statistical tool for its extraction. The feature could be spatial or temporal. In emotion recognition the input is the EEG signal, or in some cases other physiological signals, but it is important to choose proper features of these signals. Feature selection is important in many respects: a proper feature enhances classification accuracy, while reducing the number of features simplifies the model, shortens processing time and reduces feature redundancy. Physiological signals are generally contaminated with noise from other systems of the human body. It is necessary to find relationships or dependencies between the features and the output response; these dependencies could be linear or nonlinear. Feature selection aims to reduce dimensionality and redundancy. The alpha band, in the range of 8 to 13 Hz, has a direct relation with the thalamus, which generates and modulates the alpha waves of the EEG signal (Lindgren et al., 1999), (Schürmann and Başar, 2001).

It is evident from contemporary research that a large number of researchers use time-domain methods such as Empirical Mode Decomposition (EMD). The Wavelet Transform (WT) with different basis functions has also been used extensively for feature extraction in the time-frequency domain. The Short-Time Fourier Transform (STFT) and the Fast Fourier Transform (FFT) are also reported for the feature extraction process.

In (Hadjidimitriou and Hadjileontiadis, 2012), features are extracted from the beta and gamma bands of EEG signals using the spectrogram, the Zhao-Atlas-Marks distribution and the Hilbert-Huang spectrum (HHS), and emotion recognition is done using SVM, kNN, and quadratic and linear discriminant classifiers. While the study on EEG-based emotion recognition is commendable, it is unclear whether the high accuracy of 86.52 (±0.76) percent refers to liking or disliking of music.

In an interesting research work, Hjorth parameters were used: Velchev et al. (2016) examined the effectiveness of sub-bands such as theta, alpha, beta and gamma using Hjorth parameters, which were then fed to a Support Vector Machine to classify emotions. Out of three emotional levels, the top arousal class was recognized with an accuracy of 80%. The publication is, however, equivocal about channel selection for emotion extraction and about overall accuracy.
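The three Hjorth parameters are simple time-domain descriptors: activity is the signal variance, mobility is the square root of the ratio of the variance of the first derivative to that of the signal, and complexity is the mobility of the derivative divided by the mobility of the signal. A minimal NumPy sketch on placeholder data:

    import numpy as np

    def hjorth(x):
        # return (activity, mobility, complexity) of a 1-D signal
        dx, ddx = np.diff(x), np.diff(np.diff(x))
        activity = np.var(x)
        mobility = np.sqrt(np.var(dx) / np.var(x))
        complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
        return activity, mobility, complexity

    x = np.random.randn(1280)             # placeholder: 10 s at 128 Hz
    print(hjorth(x))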

In the time domain, the Empirical Mode Decomposition (EMD) method has been widely applied. Zhuang et al. (2017) decomposed the EEG signal into finer oscillations called Intrinsic Mode Functions (IMFs) and then used multidimensional information of the IMFs as features: the first difference of the time series, the first difference of the phase, and the normalised energy were used to classify emotion with an SVM. According to the researchers, IMF1 is thought to play a role in emotion perception. The reported accuracy was 69.10 percent for valence and 71.99 percent for arousal. A limitation of the EMD method is that the IMFs are not correctly estimated if noise is associated with the signal.
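A minimal sketch of EMD-based features is given below, assuming the third-party PyEMD (EMD-signal) package. The two features loosely follow those named above (first difference and normalised energy per IMF) and are not the exact formulation of Zhuang et al.:

    import numpy as np
    from PyEMD import EMD                 # pip package 'EMD-signal' (assumed)

    x = np.random.randn(1280)             # placeholder EEG channel segment
    imfs = EMD()(x)                       # rows are IMFs, finest oscillation first

    features = []
    for imf in imfs:
        first_diff = np.mean(np.abs(np.diff(imf)))        # first difference of the series
        norm_energy = np.sum(imf ** 2) / np.sum(x ** 2)   # energy relative to the signal
        features.extend([first_diff, norm_energy])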

Variants of the Wavelet Transform have also been observed in emotion recognition research. Gupta et al. (2018) proposed decomposing EEG signals into numerous sub-bands using the Flexible Analytic Wavelet Transform (FAWT). For the categorization of emotion from the information potential, Random Forest and SVM algorithms were applied, with accuracies of 59 percent and 83 percent for the DEAP and SEED databases respectively. The reasons for such a huge disparity in accuracy, however, have not been explained.

In another interesting work, a large number of features were explored: Chanel et al. (2009) applied the Short-Time Fourier Transform (STFT) with 512-sample windows and 50% overlap between two consecutive windows to extract features. They used 16,704 features (64 electrodes × 9 frequency bands × 29 time frames) in one set. In another feature set, they used mutual information (MI) between pairs of electrodes in different areas of the brain. However, it is not clear how MI between two electrodes would help in emotion recognition.

Table 2.4: Some of the features and their statistical tools for extraction

Author                                     EEG Feature                                       Feature extraction tool
Chanel et al. (2009)                       Delta, theta, alpha, beta & gamma                 STFT and MI (between pairs of electrodes)
Khalili and Moradi (2009)                  Theta, alpha, beta & gamma                        Statistical & GP
Murugappan et al. (2009)                   Alpha                                             WT (db4)
Koelstra et al. (2010)                     Fixed 1-10 Hz bandwidths with 50% band overlap    PSD and CSP
Petrantonakis and Hadjileontiadis (2010)   Alpha & beta                                      HOC
Chanel et al. (2011)                       Theta, alpha and beta                             Statistical
Nie et al. (2011)                          Alpha, beta and gamma                             FFT
Wang et al. (2011)                         Alpha, beta and gamma                             mRMR
Duan et al. (2012)                         Delta, theta, alpha, beta & gamma                 STFT, PSD, DASM & RASM
Hosseini and Naghibi-Sistani (2011)        Delta, theta, alpha & beta                        STFT, PSD, AE & WE
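A sketch of time-resolved spectral features in the spirit of this setup, using SciPy's STFT with a 512-sample window and 50% overlap; the sampling rate and the band-averaging step are assumptions for illustration:

    import numpy as np
    from scipy.signal import stft

    fs = 256                                   # assumed sampling rate
    x = np.random.randn(30 * fs)               # placeholder 30-second channel

    # 512-sample windows with 50% overlap, as in the setup described above
    f, t, Z = stft(x, fs=fs, nperseg=512, noverlap=256)
    power = np.abs(Z) ** 2                     # (frequencies, time frames)

    alpha = (f >= 8) & (f < 13)
    alpha_over_time = power[alpha].mean(axis=0)   # one alpha value per time frame
    print(len(t), alpha_over_time.shape)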

The application of selective spectral bands has been put into research by several researchers, as mentioned in the following paragraphs. Khalili and Moradi (2009) used the four spectral bands theta, alpha, beta and gamma. They also experimented with other physiological signals such as galvanic skin resistance, respiration, blood pressure and temperature, and found that the EEG signal is more effective than the peripheral signals. Murugappan et al. (2009) carried out research based on the alpha (α) band and achieved average accuracies of 66.66% and 66.67% for visual and audio-visual stimuli respectively; they used the db4 wavelet function (WT) to decompose the EEG signal into frequency bands, as sketched below, and a neural network for classification. Koelstra et al. (2010) recorded EEG and other peripheral signals in their laboratory; they used power spectral density (PSD) and Common Spatial Patterns (CSP) as features. They reported that PSD analysis with a bandwidth of 1-10 Hz and 50% band overlap captures the rhythmic variations of brainwaves, and they further used CSP to classify the signal according to its variance. Petrantonakis et al. (Petrantonakis and Hadjileontiadis, 2010) explored feature extraction with the help of Higher-Order Crossings (HOC) besides Empirical Mode Decomposition (EMD) and cross-correlation. Chanel et al. (2011) extracted the energy of spectral bands such as theta, alpha and beta of the EEG as features, using the Fast Fourier Transform (FFT) method.
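A db4 decomposition of this kind can be sketched with the PyWavelets package; the level-to-band mapping in the comments assumes a 256 Hz sampling rate and is approximate:

    import numpy as np
    import pywt

    fs = 256
    x = np.random.randn(10 * fs)               # placeholder channel segment

    # five-level DWT with db4; at fs = 256 Hz the levels roughly cover
    # D1 64-128 Hz, D2 32-64 (gamma), D3 16-32 (beta), D4 8-16 (alpha),
    # D5 4-8 (theta), and A5 0-4 Hz (delta)
    a5, d5, d4, d3, d2, d1 = pywt.wavedec(x, 'db4', level=5)

    alpha_energy = np.sum(d4 ** 2)             # example feature from the alpha level
    print(alpha_energy)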

Similarly, Mikhail et al. (2013) used the alpha (α) band while predicting four different emotions. The accuracies achieved were 51%, 53%, 58% and 61% for joy, anger, fear and sadness respectively. They further experimented with a reduced channel set and achieved accuracies of 33%, 38%, 33% and 37.5% for joy, anger, fear and sadness respectively. The beta (β) spectral band of the EEG signal is normally related to a relaxed but focused state of the subject. Beta rhythms, lying between 13 and 30 Hz, are normally found in the frontal and central regions rather than the posterior regions.

The slowest brainwave state is delta. Here the brainwaves have the greatest amplitude and the slowest frequency, typically centred around 1.5 to 4 cycles per second. They never go down to zero, because that would mean the person is brain-dead; deep, dreamless sleep takes human beings down to the lowest frequency, typically 2 to 3 cycles per second.

When adults go to bed and read for a few minutes before attempting sleep, they are likely to be in low beta. When they put the book down, turn off the lights and close their eyes, the brainwaves descend from beta to alpha, to theta and finally, when they fall asleep, to delta.

Zhuang et al. (2017) experimented with the beta (16-32 Hz) and gamma (32-64 Hz) components of EEG signals as features. They also discussed a few merits and demerits of Empirical Mode Decomposition (EMD). A comprehensive list of features and feature extraction tools is presented in Table 2.4.

2.7 Signal Processing

There are cases where the direct features are not applied for classification: a few researchers have modified or selected features carefully so that the classification becomes more efficient. Petrantonakis and Hadjileontiadis (2010) described how they used the asymmetry index to segment areas before using Empirical Mode Decomposition (EMD) to extract features. For classifying valence and arousal emotions, they used higher-order crossings and cross-correlation as feature extraction tools and SVM as a classifier.

For channel selection out of all EEG channels, Wang et al. (2019) used normalized mutual information; they also used the short-time Fourier transform (STFT) to compute the EEG spectrogram. In another fascinating study, by García et al. (2016), EEG signals are converted from a lower- to a higher-dimensional space using a Gaussian kernel. A Gaussian process is used to extract the features, in a manner quite similar to deconvolution or decompression. A latent space is employed in this method to find features that correlate better with the emotional score. To reach the result, classification is performed in the higher-dimensional space, with the findings projected back to the lower-dimensional space.

The three structural components of emotion categorization from EEG data that have gained the most attention in research are feature extraction, signal processing and classification. Several studies (Piho and Tjahjadi, 2018), (Wang et al., 2019) look at the emotional content of physiological signals, which is crucial for emotion classification. Because not all of the features are necessary for emotion recognition, Atkinson and Campos (2016) employed mutual information to choose the important features while discarding the redundant ones; mutual information is their fundamental criterion for relating features and classes. Training and testing were carried out using an 8-fold cross-validation method, and classification was done using an SVM classifier with RBF and polynomial kernels. To choose channels from a large number of EEG channels, Wang et al. (2019), as noted above, employed normalized mutual information, with the EEG spectrogram assessed using the short-time Fourier transform (STFT).
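This kind of mutual-information-based ranking is available off the shelf; the sketch below, on placeholder data, ranks features by their MI with the emotion labels and keeps the top k, which is one simple reading of the selection criterion described above:

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 64))      # placeholder: 200 trials x 64 features
    y = rng.integers(0, 2, 200)             # placeholder binary emotion labels

    mi = mutual_info_classif(X, y, random_state=0)
    k = 10
    selected = np.argsort(mi)[::-1][:k]     # the k most informative features
    X_selected = X[:, selected]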

As mentioned above, mutual information has been used as a criterion for signal selection with the DEAP and MAHNOB-HCI datasets (Piho and Tjahjadi, 2018). The amount of useful data is selected based on the MI between the data and the emotional labels, and the signal segments with the maximum information are used for feature extraction. Five frequency bands were investigated; the authors, on the other hand, did not look at any optimization methods. The present work takes a cue from this particular research and applies different algorithms to increase the mutual information between the features and the manual score of emotion.

2.8 Research work on Channel Reduction

Intuitively, one might consider that all the EEG channels contribute to emotion recognition; nevertheless, several EEG channel reduction processes have been reported recently. Srinivas et al. (Nadipalli et al., 2014) used the wavelet transform to extract features from the gamma, beta, alpha, theta and delta bands, and used Radial Basis Function and Multilayer Perceptron models for classification. Further, they propose that occipital lobe channels such as Oz, O1 and O2 give better accuracies than the other EEG channels. They also stated that the wavelet transform gives better frequency segmentation results than the Fourier transform. Wang et al. (2019) used Normalized Mutual Information (NMI) to reduce channel redundancy and hardware complexity, slicing the channels of the DEAP dataset; short-time Fourier transforms were used to capture the EEG spectrogram, and classification was finally done with the widely used SVM classifier. Atkinson et al. (Atkinson and Campos, 2016) selected relevant channels with the help of the minimum Redundancy Maximum Relevance (mRMR) method, segregating features based on correlation to reduce information redundancy.
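One reading of NMI-based channel reduction is to prune channels whose discretised signals share high normalized mutual information with a channel already retained. The sketch below illustrates this on placeholder data; it is not the exact procedure of Wang et al., and the bin count and 0.8 threshold are assumptions:

    import numpy as np
    from sklearn.metrics import normalized_mutual_info_score

    rng = np.random.default_rng(0)
    eeg = rng.standard_normal((32, 8064))       # placeholder: 32 channels x samples

    def binned(ch, bins=16):
        # discretise a channel into quantile bins for NMI estimation
        edges = np.quantile(ch, np.linspace(0, 1, bins + 1)[1:-1])
        return np.digitize(ch, edges)

    codes = [binned(ch) for ch in eeg]
    kept = []
    for i in range(len(codes)):
        # keep channel i only if no already-kept channel is highly redundant with it
        if all(normalized_mutual_info_score(codes[i], codes[j]) < 0.8 for j in kept):
            kept.append(i)
    print('channels retained:', kept)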

2.9 Traditional Classification Techniques

SVM (Liu and Sourina, 2012), (Duan et al., 2013), (Jie et al., 2014), (Jiang et al., 2016) has long been a classifier of choice for emotion recognition, and different kernels of SVM have been used for the classification of emotion. The second most used classifier is the random forest, and the k-Nearest Neighbour has also been used extensively in the classification of emotion. Different kernels for SVM, such as the radial basis function (Ali et al., 2016), (Alsolamy and Fattouh, 2016), (Atkinson and Campos, 2016), the linear function (Li and Lu, 2009), (Koelstra et al., 2010), (Nie et al., 2011), (Wang et al., 2014), the polynomial function (Lan et al., 2016), (Liu et al., 2013) and the Gaussian kernel (Liu et al., 2016a), (Jatupaiboon et al., 2013), have been widely used for classification. Linear Discriminant Analysis (LDA) (Chanel et al., 2011), (Makeig et al., 2011), (Stikic et al., 2014) and Quadratic Discriminant Analysis (QDA) (Khalili and Moradi, 2009), (Lee and Hsieh, 2014) have also been used by a few researchers to recognize emotions.
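These kernel choices map directly onto standard library options. A minimal scikit-learn sketch on placeholder features, comparing RBF, linear and polynomial kernels by cross-validation:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 32))      # placeholder band-power features
    y = rng.integers(0, 2, 200)             # placeholder valence labels

    for kernel in ('rbf', 'linear', 'poly'):
        clf = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
        print(kernel, cross_val_score(clf, X, y, cv=5).mean())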

Wang et al. (2011) used the alpha, beta and gamma bands to extract emotion with the help of classifiers such as KNN, Multilayer Perceptron and SVM. Haiyan Xu et al. (Xu and Plataniotis, 2012) made use of the alpha and beta spectral bands of EEG signals as features, using statistical, narrow-band, Higher-Order Crossings and wavelet entropy measures, while classification was done with the k-Nearest Neighbour (KNN) classifier; they achieved a prediction accuracy of 90.77% over three emotion classes. Several other researchers have explored emotion recognition using k-NN (Hatamikia et al., 2014), (Hadjidimitriou and Hadjileontiadis, 2012), (Bastos-Filho et al., 2012).

Variants of SVM such as multiclass SVM and fuzzy SVM have also been applied to emotion detection. Yuan-Pin Lin (Lin et al., 2009) used five spectral bands of EEG signals, viz. delta (1-3 Hz), theta (4-7 Hz), alpha (8-13 Hz), beta (14-30 Hz) and gamma (31-50 Hz), and applied multiclass SVM for the classification of emotion. Using Shannon entropy and a higher-order auto-regressive model, Vijayan et al. (2015) retrieved characteristics from EEG signals of the DEAP database; a multiclass Support Vector Machine (MC-SVM) is used to classify the emotions into four types: excitement, happiness, sadness and hatred. The basis for selecting 12 EEG channels over the gamma band for emotion extraction, on the other hand, is heuristic. In his Ph.D. dissertation, Chatchinarat (Chatchinarat et al., 2017) described emotion identification using a Fuzzy Support Vector Machine (FSVM), where SVM was demonstrated to be inferior to the FSVM in terms of classification. SVM and FSVM are application-specific learning algorithms, and the author demonstrated that FSVM classification for valence, arousal and dominance is not uniform.

2.10 Modern Classification Techniques

Normally, CNNs are used to predict emotions based on facial expressions. Deep neural networks and Convolutional Neural Networks (Santamaria-Granados et al., 2018) have been used to improve accuracy, and emotion recognition using deep neural networks (DNN) and convolutional neural networks (CNN) has been described by Tripathi et al. (2017). The authors used a large neural network for the DNN, with hidden layers using ReLU as the activation function and a softmax activation function for the final output layer. Dropout probabilities of 0.25 and 0.5 were applied for the hidden and output layers respectively. When dealing with the CNN, the input data was represented as a 2D picture, with 100 initial convolution filters and a (3×3) convolution kernel. For valence and arousal, the stated levels of accuracy are 66.79 percent and 57.58 percent respectively.
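A minimal Keras sketch loosely following the configuration described above (100 filters, a 3×3 kernel, ReLU hidden activations, softmax output, and dropout rates of 0.25 and 0.5); the input shape, pooling and dense width are assumptions rather than the authors' exact network:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(40, 101, 1)),                # assumed 2D EEG feature image
        layers.Conv2D(100, (3, 3), activation='relu'),   # 100 filters, 3x3 kernel
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),                            # hidden-layer dropout
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.5),                             # pre-output dropout
        layers.Dense(2, activation='softmax'),           # e.g. high/low valence
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])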

Kang et al. (2019) applied Independent Component Analysis to remove noise originating from artefacts and other nearby channels. They also used the mutual information method to remove redundant channels, and finally a time-frequency feature was extracted using the spectrogram. The feature extraction is done by a Convolutional Neural Network (CNN) with a convolution kernel.

The Bimodal Deep AutoEncoder (BDAE) has been developed recently (Liu et al., 2016c). The BDAE was designed to accommodate EEG and eye data for feature extraction from the SEED and DEAP datasets. Hidden layers from the two separate modes are integrated using a Restricted Boltzmann Machine (RBM), and a back-propagation technique is used to fine-tune the weights of the two modes. In addition, a linear SVM was employed, reaching a classification accuracy of 83.25 percent.

Xu et al. (Xu and Plataniotis, 2016) used a Deep Belief Network (DBN) to investigate emotion recognition. They used ANOVA with SVM-RBF, SADE with a softmax activation function, and a DBN with a softmax classifier to predict emotion. In their research paper, they reported high F1 scores for the prediction of arousal, valence and liking.

Alhagry et al. (2017) applied a Recurrent Neural Network (RNN) to detect emotion. The authors' model uses two LSTM layers together with a dropout layer and a dense layer: the LSTM layers perform feature extraction and the dense layer performs the classification. High accuracies of 85.65 percent, 85.45 percent and 87.99 percent were recorded for arousal, valence and liking respectively.
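A minimal Keras sketch in the spirit of this architecture, with two LSTM layers for feature extraction, a dropout layer and a dense layer for classification; the input shape, layer widths, dropout rate and binary output are assumptions, not the authors' exact configuration:

    import tensorflow as tf
    from tensorflow.keras import layers, models

    # assumed input: windows of 128 time steps over 32 EEG channels
    model = models.Sequential([
        layers.Input(shape=(128, 32)),
        layers.LSTM(64, return_sequences=True),   # first LSTM layer (features)
        layers.LSTM(32),                          # second LSTM layer
        layers.Dropout(0.3),                      # dropout layer (rate assumed)
        layers.Dense(1, activation='sigmoid'),    # dense layer for a high/low label
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])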

For the prediction of emotion, Chen et al. (2016) used a three-stage decision process. The authors divided the subjects into a few groups using the k-Nearest Neighbour (k-NN) method to cluster comparable subjects. A random forest is then used to classify the data, and an accuracy of 70.04 percent has been reported.

2.11 Chapter Summary

In this investigation, several signal decomposition strategies were reviewed (Alarcao and Fonseca, 2017). For EEG spectrum decomposition, the Wavelet Transform (WT) and Empirical Mode Decomposition (EMD) are the most often used methods. The wavelet transform's performance is determined by the wavelet function and the scaling function, and researchers in this field use a variety of standard WT basis function families, including Symlets (sym), Haar (db1), Coiflets (coif) and Daubechies (db).

SVM is a good classifier, but only if its parameters are correctly selected and estimated: for higher accuracy, certain hyperparameters must be fine-tuned during training (Anguita et al., 2012). The learning phase of an SVM classifier is computationally demanding and complex, and although SVM is a nonlinear classifier, its accuracy is highly dependent on the kernel assumption. The accuracy of a KNN algorithm is determined by the number of nearest neighbours assumed. Because the EEG data and the manual assessment are not linearly related, cross-correlation between them deteriorates.

The state-of-the-art approaches to stimuli for emotion, the various datasets, the numerous features, the statistical tools for feature extraction, signal processing for feature extraction and classification, and ultimately emotion classification were discussed in this chapter. In conclusion, despite substantial studies on feature extraction, signal processing and classification, relatively few studies have been conclusive or generic about their findings. Furthermore, the type of features chosen appears to have a significant impact on classification and identification accuracy. There is no universally acknowledged method for classifying emotion from EEG signals, and accuracy varies greatly depending on the type of classifier and the dataset used. The underlying pattern recognition mechanism is frequently subject-specific and influenced by factors such as time, space, context, race and so on. Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), k-Nearest Neighbour (KNN), and Empirical Mode Decomposition (EMD) with the Hilbert-Huang transform are among the most often utilised approaches for EEG emotion recognition.

