
Parameter Tuned Bi-Directional Long Short Term Memory Based Emotion With Intensity Sentiment Classification Model Using Twitter Data

V. Ganesh
Research Scholar
Department of Computer and Information Science
Annamalai University
dinesh59@gmail.com

Dr. M. Kamarasan
Assistant Professor
Department of Computer and Information Science
Annamalai University
smkrasan@yahoo.com

Abstract— With the advanced growth of Internet and social networking technologies, a massive number of comments are being generated on the Web every second. In the age of big data, mining sentiments and classifying emotions using artificial intelligence (AI) techniques has gained significant interest as a way to properly understand public opinion. This paper presents a new hyperparameter tuned bi-directional long short term memory (Bi-LSTM) using differential evolution (DE), a model called DE-BiLSTM, for emotion-with-intensity based sentiment classification of Twitter data. The hyperparameters of the Bi-LSTM, namely the batch size and the number of hidden layers, are determined by means of the differential evolution (DE) algorithm. The proposed method initially performs preprocessing in several stages, such as word to feature vector conversion, after which the feature extraction process takes place. Finally, a softmax based classification process is carried out to identify the different intensities of the diverse classes that exist in the tweets. The validation of the DE-BiLSTM model is carried out against the SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset. The simulation results indicate that the DE-BiLSTM model outperforms the other methods with an average precision of 93.83%, recall of 90.41%, F-measure of 91.76% and accuracy of 96.42%.

Keywords— Deep learning, Emotion classification, Sentiment Analysis, Parameter Tuning

I. INTRODUCTION

Recently, the progressive growth of the Internet and social media has enabled a massive number of users to upload their own suggestions and comments on web pages and sites. Hence, big data of people's reviews is produced via the Internet. For instance, product reviews are produced on e-commerce websites like Amazon and Flipkart, and accommodation reviews originate on travel websites such as Trivago and MakeMyTrip. With the exponential growth of comments and reviews, a manual examination of the comments is very complex. In the period of big data, mining the sentiment of comments using artificial intelligence (AI) methods is highly suitable for understanding public opinion. The study of sentiment analysis (SA) is helpful to capture the emotional tendencies of the comments. SA is a type of text classification, drawing on natural language processing (NLP), Machine Learning (ML), Data Mining (DM), information retrieval (IR) and some other fields [1].

SA majorly concentrates on orientation analysis of a comment corpus, i.e., whether people show positive, negative, or neutral emotions toward products or actions [2-4]. Such comments portray the views of Internet users regarding goods, current news, and so on. Vendors should be capable of satisfying customers' needs with the required products, and users can judge the products by reading the comments on the Internet. Here, SA is defined as a text classification task and is used to differentiate the words applied in comments. In SA classification, learning a low dimensional and non-sparse word vector representation is a significant step [5]. The most extensively applied word representation is the distributed word vector obtained using the Word2vec method [6]. However, such shared word vectors do not incorporate sentiment information about the words. Here, the word sentiment information for text emotion classification is incorporated into the conventional term frequency–inverse document frequency (TF-IDF) technique and a weighted word vector is produced.

Different ML methods have been presented for conventional emotion classification as well as multi-label emotion classification. The supervised classifiers undergo training on a collection of annotated corpora with the application of a diverse set of handcrafted features. The efficiency of these approaches is based on 2 major factors: a large amount of labeled data and a group of features which differentiate among the samples. Under this scheme, many works have concentrated on the collection of productive features to get the best classification result [7-9]. The main aim of this approach is to identify a group of instructive features that mimic the emotions represented in the text. Bag-of-Words (BoW) and its variation, n-grams, are the representation techniques applied in text classification issues as well as emotion detection.

Various models integrate BoW features with alternate features like part of speech (PoS) tags, emotional data obtained from lexicons, statistical details, and word shapes that enrich the text representation. Even though BoW is a well-known model in the text classification
Authorized licensed use limited to: Auckland University of Technology. Downloaded on December 22,2020 at 16:12:56 UTC from IEEE Xplore. Restrictions apply.
process, it also has some limitations. First, it discards word order, so two documents with identical words but different semantics may end up with the same representation. The n-gram approach overcomes this constraint of BoW by treating the order of words as a context of length n. Further, sentiment and emotion lexicons are mandatory ingredients in deploying productive sentiment and SA modules; however, it is complex to develop these lexicons, and identifying the optimal combination of lexicons together with the best statistical features is a time-consuming operation.

In recent times, deep learning (DL) methods have been applied to develop end-to-end systems in major tasks like speech analysis, text categorization, and image classification. It has been shown that these models can extract high-level features from raw data in an automated manner [10, 11]. Baziotis et al. [12] addressed the multilabel sentiment classification task with a deep attention model. The approach of [13] trained word-level bidirectional LSTMs and added further features in an ensemble, ranking highly on the SemEval leaderboard.

This paper introduces an efficient hyperparameter tuned bi-directional long short term memory (Bi-LSTM) using differential evolution (DE), called the DE-BiLSTM model, for the classification of several intensities of Twitter data for the emotions Fear, Anger, Joy, and Sadness. The parameters of the Bi-LSTM, namely the batch size and the number of hidden layers, are tuned using DE. The proposed method initially undergoes preprocessing, namely emoji transformation and word to vector conversion. Then, the feature vectors are generated using the DE-BiLSTM model. Finally, a softmax classifier is applied for the classification of the different intensity levels that exist in the applied Twitter data. The validation of the DE-BiLSTM model is carried out against the SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset.

The rest of the study is structured as follows. Section 2 elaborates the presented DE-BiLSTM technique, and the experimental validation takes place in Section 3. At last, conclusions are given in Section 4.

II. THE DE-BILSTM MODEL

The overall working principle of the proposed DE-BiLSTM model is illustrated in Fig. 1. As shown in the figure, the DE-BiLSTM model involves preprocessing, feature extraction, and classification.

Fig. 1. Work flow of DE-BiLSTM Model

A. Preprocessing

In order to transform the original tweets into a form applicable for classification, a sequence of pre-processing steps is carried out. At the initial stage, the emojis in the tweets are converted into their respective Unicode code points and then mapped to lexicon entries. For illustration, the fearful-face emoji (😨) corresponds to the Unicode code point U+1F628 and is converted into the lexicon entry "Fearful". Next, tags and hashtags are eliminated along with stop words. As stop words are not useful in tweets, they are removed from the tweet. Similarly, special symbols and numerals are discarded to avoid confusing the vector generation task. Finally, the word to vector transformation process takes place.

B. Bi-LSTM Model

LSTM is consistent with the common RNN approach; however, it applies different methods to estimate the hidden state, which resolves the issue that an RNN is not capable of dealing with long-distance dependencies. The LSTM is comprised of a sequence of identical memory units with 3 gates. Taking the text feature vector S and the t-th word as an instance, the values of the corresponding states of the LSTM unit for word t are given as follows, where σ is the sigmoid function and ⊙ implies element-wise multiplication [14]. The forget gate f_t is defined as:

f_t = σ(W_f w_t + U_f h_{t-1} + b_f)   (1)

The input gate i_t is:

i_t = σ(W_i w_t + U_i h_{t-1} + b_i)   (2)
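The preprocessing stages of Section II-A (emoji-to-lexicon conversion, tag/hashtag and stop-word removal, special-symbol stripping) might be sketched as follows. The emoji lexicon and stop-word list here are small illustrative stand-ins, not the paper's actual resources:

```python
import re

# Illustrative emoji-to-lexicon map; the paper's full lexicon is not specified,
# so this covers only the example given in the text (U+1F628 -> "Fearful").
EMOJI_LEXICON = {"\U0001F628": "Fearful"}

# Small illustrative stop-word list (a real list would be much larger).
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of"}

def preprocess(tweet):
    """Emoji -> lexicon word, strip tags/hashtags/specials, drop stop words."""
    # 1. Replace each known emoji with its lexicon entry.
    for emoji, word in EMOJI_LEXICON.items():
        tweet = tweet.replace(emoji, " " + word + " ")
    # 2. Remove user tags and hashtags.
    tweet = re.sub(r"[@#]\w+", " ", tweet)
    # 3. Remove special symbols and numerals.
    tweet = re.sub(r"[^A-Za-z\s]", " ", tweet)
    # 4. Tokenize and drop stop words.
    return [t.lower() for t in tweet.split() if t.lower() not in STOP_WORDS]

print(preprocess("I am scared of the dark \U0001F628 #fear @friend"))
```

The resulting token list would then be fed to the word-to-vector transformation step.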

The candidate memory cell state c̃_t at the current time step, where tanh denotes the hyperbolic tangent function, is:

c̃_t = tanh(W_c w_t + U_c h_{t-1} + b_c)   (3)

The cell state c_t is updated as follows, where the values of f_t and i_t lie in (0, 1). The term i_t ⊙ c̃_t indicates that new data from the candidate unit c̃_t is recorded in c_t, while f_t ⊙ c_{t-1} means that part of the existing memory c_{t-1} is maintained:

c_t = i_t ⊙ c̃_t + f_t ⊙ c_{t-1}   (4)

The output gate o_t is:

o_t = σ(W_o w_t + U_o h_{t-1} + b_o)   (5)

h_t is the hidden layer state at time t:

h_t = o_t ⊙ tanh(c_t)   (6)

The LSTM captures only the past information of the sequence, which is inadequate; when future information can also be accessed, it may be very helpful in a range of tasks. The bidirectional LSTM is composed of forward and backward LSTM layers: the forward layer holds the past context of the sequence while the backward layer contains the future context. These layers are linked to the same output layer, so the full context of the sequence is employed.

Assume the input at time t is the word embedding w_t, the output of the forward hidden unit at time t-1 is h⃗_{t-1}, and the output of the backward hidden unit at time t+1 is h⃖_{t+1}. Then the outputs of the forward and backward hidden units at time t are given as:

h⃗_t = L(w_t, h⃗_{t-1}, c_{t-1})   (7)

h⃖_t = L(w_t, h⃖_{t+1}, c_{t+1})   (8)

where L(·) is the hidden layer function of the LSTM. The forward output vector h⃗_t and the backward output vector h⃖_t have to be concatenated to obtain the text feature, where H is the number of hidden layer cells:

h_t = h⃗_t ‖ h⃖_t   (9)

C. DE based Bi-LSTM Model

The key objective is to optimize the hyperparameters of the BiLSTM classifier with the help of the DE model and reach the optimal classification result. Here, the tuned parameters are the batch size and the number of hidden neurons. The DE method starts from an initial population that is generated in a random manner and tries to boost the accuracy of the sentiment classification model until the termination condition is met. The fitness function (FF) is the BiLSTM system itself, which is in charge of estimating the accuracy of emotion classification. The DE model was introduced by Storn for the optimization of real parameters and real-valued functions. This model is a population based technique which is extensively applied to a wide range of search problems. In recent times, the efficiency of this method has been demonstrated in diverse fields like global numerical optimization and feature selection (FS) for emotion analysis. Similar to the Genetic Algorithm (GA), DE applies crossover and mutation operators, but with an explicit updating function. The optimization task in DE is comprised of 4 steps, namely Initialization, Mutation, Crossover and Selection. The last 3 phases are repeated until the termination criterion has been satisfied.

D. Softmax Classifier

The resulting vector v is provided directly to a softmax layer for SA detection. The predicted outcome is given as:

ŷ = softmax(w v + b)   (10)

The cross-entropy loss is established to estimate the gap between the actual emotion classes y and the estimated emotional classes ŷ:

loss = − Σ_i y_i log ŷ_i   (11)

where i implies the index value of a sentence.

III. PERFORMANCE VALIDATION

To verify the effective results of the DE-BiLSTM technique, a simulation is carried out in the Python programming language. The applied dataset and the attained simulation outcomes are presented in the following subsections. To assess the outcome of the projected DE-BiLSTM approach, the benchmark SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset is utilized [15]. It contains a set of 4042 tweets covering 4 emotion classes, namely joy, fear, anger, and sadness. Of the given tweets, 1074 tweets come under the class of joy, 650 tweets belong to fear, 991 tweets fit into the class anger, whereas 555 tweets come under the class of sadness. The dataset is partitioned into training and testing data with a ratio of 75% and 25% respectively.

Table 1 provides the classification outcome for the affect dimension obtained by applying the DE-BiLSTM model under various measures. The table values imply that the instances of the anger class are productively classified with a precision value of 99.89%, recall of 98.38%, F-measure of 99.13% and accuracy of 99.47%. At the same time, the instances of the fear class are successfully classified with a precision of 99.38%, recall of 99.07%, F-measure of 99.22% and accuracy of 99.69%. In line with this, the instances of the joy class are effectively classified with a precision of 98.79%, recall of 99.53%, F-measure of 99.16% and accuracy of 99.44%. Likewise, the samples in the sadness class are classified with a precision of 97.16%, recall of 98.73%, F-measure of 97.94% and accuracy of 99.29%.
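The softmax layer and cross-entropy loss of Section II-D, Eqs. (10) and (11), amount to a linear layer followed by softmax and a negative log-likelihood. A minimal NumPy sketch, with illustrative dimensions and randomly chosen weights (in the model, v would come from the Bi-LSTM):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(y_true, y_pred):
    # Eq. (11): loss = -sum_i y_i * log(y_hat_i)
    return -np.sum(y_true * np.log(y_pred))

# Illustrative feature vector v and randomly chosen layer weights w, b
# for 4 emotion classes (fear, anger, joy, sadness) - assumed shapes.
rng = np.random.default_rng(0)
v = rng.normal(size=8)          # text feature vector from the Bi-LSTM
w = rng.normal(size=(4, 8))     # softmax layer weights
b = rng.normal(size=4)          # softmax layer bias

y_hat = softmax(w @ v + b)                # Eq. (10): class probabilities
y_true = np.array([0.0, 1.0, 0.0, 0.0])   # one-hot label, e.g. "anger"
print(y_hat, cross_entropy(y_true, y_hat))
```

The probabilities sum to 1, and the loss shrinks as the probability assigned to the true class grows.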
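A single step of the LSTM cell described by Eqs. (1)–(6), plus the forward/backward combination of Eqs. (7)–(9), can be sketched in NumPy as follows. The dimensions and random weights are illustrative, and the backward pass here simply reuses the same step function on the same input:

```python
import numpy as np

def lstm_step(w_t, h_prev, c_prev, P):
    """One LSTM step following Eqs. (1)-(6); P holds the weight matrices."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    f = sig(P["Wf"] @ w_t + P["Uf"] @ h_prev + P["bf"])            # Eq. (1) forget gate
    i = sig(P["Wi"] @ w_t + P["Ui"] @ h_prev + P["bi"])            # Eq. (2) input gate
    c_tilde = np.tanh(P["Wc"] @ w_t + P["Uc"] @ h_prev + P["bc"])  # Eq. (3) candidate
    c = i * c_tilde + f * c_prev                                   # Eq. (4) cell state
    o = sig(P["Wo"] @ w_t + P["Uo"] @ h_prev + P["bo"])            # Eq. (5) output gate
    h = o * np.tanh(c)                                             # Eq. (6) hidden state
    return h, c

# Illustrative sizes: embedding dimension 5, hidden dimension H = 3.
rng = np.random.default_rng(1)
D, H = 5, 3
P = {f"W{g}": rng.normal(size=(H, D)) for g in "fico"}
P.update({f"U{g}": rng.normal(size=(H, H)) for g in "fico"})
P.update({f"b{g}": np.zeros(H) for g in "fico"})

w_t = rng.normal(size=D)
h_fwd, _ = lstm_step(w_t, np.zeros(H), np.zeros(H), P)  # forward pass, Eq. (7)
h_bwd, _ = lstm_step(w_t, np.zeros(H), np.zeros(H), P)  # backward pass, Eq. (8)
h_t = np.concatenate([h_fwd, h_bwd])                     # Eq. (9): concatenation
print(h_t.shape)
```

The concatenated vector has dimension 2H, which is what the softmax layer receives.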
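The four DE phases of Section II-C can be sketched as below. The fitness function is a stand-in: in the paper it is the BiLSTM's classification accuracy for a candidate (batch size, hidden neurons) pair, which is too expensive to reproduce here, and its peak at (32, 128) is purely illustrative:

```python
import random

def fitness(batch_size, hidden):
    # Stand-in for training a BiLSTM and returning its accuracy;
    # peaked at (32, 128) purely for illustration.
    return -((batch_size - 32) ** 2 + (hidden - 128) ** 2)

BOUNDS = [(8, 256), (16, 512)]      # (batch size, hidden neurons) search ranges
NP, F, CR, GENS = 10, 0.5, 0.9, 50  # population size, scale factor, crossover rate, generations

random.seed(0)
# Initialization: random population inside the bounds.
pop = [[random.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(NP)]
init_best = max(pop, key=lambda p: fitness(*p))[:]

for _ in range(GENS):
    for j in range(NP):
        # Mutation: v = a + F * (b - c) from three distinct other members.
        a, b, c = random.sample([p for k, p in enumerate(pop) if k != j], 3)
        v = [min(max(a[d] + F * (b[d] - c[d]), BOUNDS[d][0]), BOUNDS[d][1])
             for d in range(2)]
        # Crossover: mix mutant and target dimension-wise.
        u = [v[d] if random.random() < CR else pop[j][d] for d in range(2)]
        # Selection: keep the better of trial and target.
        if fitness(*u) >= fitness(*pop[j]):
            pop[j] = u

best = max(pop, key=lambda p: fitness(*p))
print(best)
```

Because selection never replaces a member with a worse trial, the best fitness in the population is non-decreasing over generations.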

TABLE I
PERFORMANCE MEASURES OF AFFECT DIMENSION USING PROPOSED DE-BILSTM
Class Precision Recall F-Measure Accuracy
Anger 99.89 98.38 99.13 99.47
Fear 99.38 99.07 99.22 99.69
Joy 98.79 99.53 99.16 99.44
Sadness 97.16 98.73 97.94 99.29
Average 98.81 98.93 98.86 99.47
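Per-class measures of the kind reported in Table I can be derived from one-vs-rest counts against each class. An illustrative sketch with hypothetical labels (not the paper's actual predictions):

```python
def per_class_metrics(y_true, y_pred, label):
    """One-vs-rest precision, recall, F-measure and accuracy for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    accuracy = (tp + tn) / len(y_true)  # one-vs-rest accuracy for this class
    return precision, recall, f_measure, accuracy

# Hypothetical labels over the four emotion classes.
y_true = ["joy", "fear", "anger", "sadness", "joy", "anger"]
y_pred = ["joy", "fear", "anger", "joy", "joy", "sadness"]
print(per_class_metrics(y_true, y_pred, "joy"))
```

Averaging these per-class values over the four emotions yields the last row of Table I.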

Fig. 2. Accuracy Analysis of Training and Validation during Model Creation

Fig. 3. Loss Graph of Training and Validation during Model Creation

Meantime, the presented technique has reached a high average precision of 98.81%, recall of 98.93%, F-measure of 98.86% and accuracy of 99.47%.

Fig. 2 demonstrates the accuracy graph obtained during the training and validation process of model creation over a diverse number of epochs. From the figure, it is clear that the validation accuracy keeps pace with and rises above the training accuracy. Besides, it is observed that accuracy gets enhanced as the number of epochs increases. Fig. 3 shows the loss graph obtained during the training and validation process under various epochs. The figure evidently shows that the validation loss is gradually reduced compared to the training loss, and that the loss decreases as the number of epochs increases.

Fig. 4. Comparative accuracy analysis of DE-BiLSTM with existing models

Fig. 4 depicts the results obtained by the DE-BiLSTM method and recently presented techniques with respect to accuracy. The figure clearly shows that the Linear SVC model provided an impractical result with the least accuracy of 48.90%. Besides, the GRU and Context Aware models offered minimal performance with lower accuracies of 52.40% and 53.20% respectively. Also, the MNB and RF methods depicted slightly better and closely matched results, giving accuracy values of 54.70% and 54% respectively. In the same way, the models of Mohammed Jabreel et al., Mondher Bouazizi et al., and Malak Abdullah et al. attained moderate results with accuracies of 59%, 60.20% and 59.90%. Additionally, the MLR and CNN models outperformed the earlier methods by providing higher accuracies of 62.27% and 82.72%. Though the LSTM and Bi-LSTM models accomplished accuracies of 92.04% and 94.73%, the DE-BiLSTM model reached the maximum accuracy of 96.42%.

From the above experimental analysis, it is clear that the presented DE-BiLSTM model performs significantly better than previous approaches. Hence, it can be employed as a proper tool for emotion classification of content on social networking sites like Twitter, Instagram, and so on.

IV. CONCLUSION

This paper has developed a parameter tuned BiLSTM using DE, called the DE-BiLSTM model, for the classification of multiple intensities of Twitter data. The proposed DE-BiLSTM has classified the Twitter data into a set of emotions, namely fear, anger, joy and sadness. The validation of the DE-BiLSTM model is carried out against the SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset. The simulation results indicated that the DE-BiLSTM model has outperformed the other methods with an average precision of 93.83%, recall of 90.41%, F-measure of 91.76% and accuracy of 96.42%. In the future, the proposed DE-BiLSTM model can be extended with hybrid evolutionary algorithms for parameter tuning.

REFERENCES

[1] L. Wang, D. Miao, and Z. Zhang, "Emotional Analysis on Text Sentences Based on Topic," Computer Science, vol. 41, no. 3, pp. 32–35, 2014.
[2] S. Krishnamoorthy, "Sentiment analysis of financial news articles using performance indicators," Knowledge & Information Systems, vol. 56, no. 2, pp. 373–394, 2018.
[3] N. Shelke, S. Deshpande, and V. Thakare, "Domain independent approach for aspect oriented sentiment analysis for product reviews," in Proceedings of the 5th International Conference on Frontiers in Intelligent Computing: Theory and Applications, Singapore, 2017, pp. 651–659.
[4] P. Sharma and N. Mishra, "Feature level sentiment analysis on movie reviews," in 2016 2nd International Conference on Next Generation Computing Technologies (NGCT), IEEE, Dehradun, India, 2016, pp. 306–311.
[5] Q. Zhang, S. Zhang, and Z. Lei, "Chinese sentiment classification based on improved convolutional neural network," Computer Engineering and Applications, vol. 53, no. 22, pp. 116–120, 2017.
[6] D. Zhang et al., "Research of Chinese Comments Sentiment Classification Based on Word2vec and SVMperf," Computer Science, vol. 43, no. 6A, pp. 418–421, 447, 2016.
[7] M. Jabreel and A. Moreno, "SentiRich: Sentiment Analysis of Tweets Based on a Rich Set of Features," in Artificial Intelligence Research and Development, vol. 288, pp. 137–146, 2016.
[8] M. Jabreel and A. Moreno, "SiTAKA at SemEval-2017 Task 4: Sentiment Analysis in Twitter Based on a Rich Set of Features," in Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, BC, Canada, August 2017, pp. 694–699.
[9] S. Mohammad, F. Bravo-Marquez, M. Salameh, and S. Kiritchenko, "SemEval-2018 Task 1: Affect in Tweets," in Proceedings of the 12th International Workshop on Semantic Evaluation, New Orleans, LA, USA, June 2018, pp. 1–17.

[10] Y. LeCun, Y. Bengio, and G. Hinton, "Deep Learning," Nature, vol. 521, pp. 436–444, 2015.
[11] D. Tang, B. Qin, and T. Liu, "Deep Learning for Sentiment Analysis: Successful Approaches and Future Challenges," Wiley Interdiscip. Rev. Data Min. Knowl. Discov., vol. 5, pp. 292–303, 2015.
[12] C. Baziotis, N. Athanasiou, A. Chronopoulou, and A. Kolovou, "NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning," in Proceedings of the 12th International Workshop on Semantic Evaluation, New Orleans, LA, USA, June 2018, pp. 245–255.
[13] H. Meisheri and L. Dey, "TCS Research at SemEval-2018 Task 1: Learning Robust Representations using Multi-Attention Architecture," in Proceedings of the 12th International Workshop on Semantic Evaluation, New Orleans, LA, USA, June 2018, pp. 291–299.
[14] A. Graves and J. Schmidhuber, "Framewise phoneme classification with bidirectional LSTM and other neural network architectures," Neural Networks, vol. 18, no. 5-6, pp. 602–610, 2005.
[15] https://competitions.codalab.org/competitions/17751

