ganesh2020
Abstract: With the rapid growth of the Internet and social networking technologies, a massive number of comments are generated on the Web every second. In the age of big data, sentiment mining and emotion classification using artificial intelligence (AI) techniques have gained significant interest as a way to properly understand public opinion. This paper presents a new hyperparameter-tuned bi-directional long short-term memory (Bi-LSTM) model using differential evolution (DE), called DE-BiLSTM, for intensity-based emotion and sentiment classification of Twitter data. The hyperparameters of the Bi-LSTM, namely the batch size and the number of hidden layers, are determined by the DE algorithm. The proposed method first performs preprocessing in several stages, such as word to feature vector conversion, followed by feature extraction. Finally, a softmax-based classification process identifies the different intensities and classes that exist in the tweets. The DE-BiLSTM model is validated against the SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset. The simulation results indicate that the DE-BiLSTM model outperforms the other methods, with an average precision of 93.83%, recall of 90.41%, F-measure of 91.76%, and accuracy of 96.42%.

Keywords: Deep learning, Emotion classification, Sentiment Analysis, Parameter Tuning

I. INTRODUCTION

Recently, the rapid growth of the Internet and social media has enabled a massive number of users to post their own suggestions and comments on web pages and sites. Hence, big data of people's reviews is produced via the Internet. For instance, product reviews are produced on E-commerce websites like Amazon and Flipkart, and residency comments originate on travel websites such as trivago and MakeMyTrip. With the exponential growth of comments and reviews, manual examination of the comments is very difficult. In the era of big data, mining the sentiment of comments using artificial intelligence (AI) methods is highly suitable for understanding public opinion. Sentiment analysis (SA) helps to capture the emotional tendencies of the comments. SA is a type of text classification that draws on natural language processing (NLP), Machine Learning (ML), Data Mining (DM), information retrieval (IR), and other fields [1].

SA mainly concentrates on analyzing the orientation of a comment corpus, that is, whether people express positive, negative, or neutral emotions about products or actions [2-4]. Such comments portray the views of Internet users regarding goods, current news, and so on. Vendors should be capable of satisfying customers' needs with the required products. Users can judge the products by reading the comments via the Internet. Here, SA is defined as a text classification task and is used to differentiate the words applied in comments. In SA classification, learning a low-dimensional, non-sparse word vector representation is a significant step [5]. The most widely applied word representation is the distributed word vector obtained using the Word2vec method [6]. However, such distributed word vectors do not contain sentiment information about the words. Here, the word sentiment information for text emotion classification is incorporated into the conventional term frequency–inverse document frequency (TF-IDF) technique, and a weighted word vector is produced.

Different ML methods have been presented for conventional emotion classification as well as multi-label emotion classification. The supervised classifiers are trained on collections of annotated corpora using a diverse set of handcrafted features. The effectiveness of these approaches depends on two major factors: a large amount of labeled data and a set of features that discriminates among the samples. Under this scheme, many works have concentrated on selecting effective features to get the best classification result [7-9]. The main aim of this approach is to identify a set of informative features that capture the emotions expressed in the text. Bag-of-Words (BoW) and its variant, n-grams, are the representation techniques applied in text classification problems as well as emotion detection.

Various models integrate BoW features with additional features such as part-of-speech (PoS) tags, emotional information obtained from lexicons, statistical details, and word shapes to enrich the text representation. Even though BoW is a well-known model in the text classification
Authorized licensed use limited to: Auckland University of Technology. Downloaded on December 22,2020 at 16:12:56 UTC from IEEE Xplore. Restrictions apply.
process, it also has some limitations. First, it discards word order, so two documents that contain identical words may have the same representation even though they have different semantics. The n-gram approach is used to overcome this constraint of BoW by treating the order of the words as a context of length n. Here, sentiment and emotion lexicons are essential components in deploying effective SA modules. However, it is difficult to develop these lexicons. Furthermore, identifying the optimal combination of lexicons and statistical features is a time-consuming operation.
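The word-order limitation of BoW can be illustrated with a minimal Python sketch. The two example sentences and the helper names `bag_of_words` and `ngrams` are hypothetical and chosen only for illustration; tokenization is assumed to be simple whitespace splitting.

```python
from collections import Counter


def bag_of_words(text):
    """Count each lowercase token; word order is discarded."""
    return Counter(text.lower().split())


def ngrams(text, n=2):
    """Count contiguous runs of n tokens, which keeps local word order."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical sentences: identical words, different meaning.
doc_a = "dog bites man"
doc_b = "man bites dog"

same_bow = bag_of_words(doc_a) == bag_of_words(doc_b)   # True: BoW cannot tell them apart
same_bigrams = ngrams(doc_a, 2) == ngrams(doc_b, 2)     # False: bigrams differ
```

With n = 2 the two sentences share no bigram at all, which is the sense in which the n-gram context restores the order information that BoW drops.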
This paper introduces an efficient hyperparameter-tuned bi-directional long short-term memory (Bi-LSTM) model using differential evolution (DE), called DE-BiLSTM, for the classification of several intensities of emotions in Twitter data, such as Fear, Anger, Joy, and Sadness. The parameters of the Bi-LSTM, namely the batch size and the number of hidden layers, are tuned using DE. The proposed method first performs preprocessing, namely emoji transformation and word to vector conversion. Then, the feature vectors are generated using the DE-BiLSTM model. Finally, a softmax classifier is applied to classify the different intensity levels that exist in the Twitter data. The DE-BiLSTM model is validated against the SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset.

The rest of the paper is structured as follows. Section 2 elaborates the presented DE-BiLSTM technique, and the experimental validation takes place in Section 3. Finally, conclusions are given in Section 4.

II. THE DE-BILSTM MODEL

The overall working principle of the proposed DE-BiLSTM model is illustrated in Fig. 1. As shown in the figure, the DE-BiLSTM model involves preprocessing, feature extraction, and classification.

Fig. 1. Work flow of DE-BiLSTM Model

A. Preprocessing

In order to transform the original tweets into a form suitable for classification, a sequence of preprocessing steps is carried out. At the initial stage, the emojis in the tweets are converted into their respective Unicode code points and then into lexicon words. For illustration, the emoji 😨 is converted to its Unicode code point U+1F628 and further converted into the lexicon word “Fearful”. Next, tags and hashtags are eliminated using stop word elimination. As stop words are not useful in tweets, they are removed from the tweet. Similarly, special symbols and numerals are removed to avoid confusing the vector generation task. Finally, the word to vector transformation takes place.

B. Bi-LSTM Model

LSTM is consistent with the common RNN approach; however, it uses a different method to compute the hidden state, which resolves the RNN's inability to deal with long-distance dependencies. The LSTM approach comprises a sequence of identical memory units with 3 gates. Taking the text feature vector S and its t-th word as an instance, the values of the corresponding states of the LSTM unit for the t-th word are given as follows, where σ is the sigmoid function and ⊙ denotes element-wise multiplication [14]. The $f_t$ defines the forget gate:

$f_t = \sigma(W_f w_t + U_f h_{t-1} + b_f)$ (1)

The $i_t$ refers to the input gate:

$i_t = \sigma(W_i w_t + U_i h_{t-1} + b_i)$ (2)
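As a rough illustration of Eqs. (1) and (2), the following Python sketch evaluates a sigmoid gate of the form σ(W w + U h + b). The names `w_t` (current word vector) and `h_prev` (previous hidden state), the layer sizes, and the small weight values are assumptions made for illustration; they are not the trained parameters of the model.

```python
import math


def sigmoid(x):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate(W, U, b, w_t, h_prev):
    """Return sigmoid(W w_t + U h_prev + b) as a list, one value per hidden unit."""
    activations = []
    for row_w, row_u, bias in zip(W, U, b):
        z = (sum(a * x for a, x in zip(row_w, w_t))
             + sum(a * h for a, h in zip(row_u, h_prev))
             + bias)
        activations.append(sigmoid(z))
    return activations


# Hypothetical sizes: 3-dimensional word vector, 2 hidden units.
w_t = [0.5, -1.0, 0.25]
h_prev = [0.0, 0.1]
W_f = [[0.1, 0.2, -0.3], [0.0, -0.1, 0.4]]
U_f = [[0.05, -0.2], [0.3, 0.1]]
b_f = [0.0, 0.0]

f_t = gate(W_f, U_f, b_f, w_t, h_prev)  # forget gate, same form as Eq. (1)
```

The input gate of Eq. (2) has the same form and would reuse `gate` with its own weight matrices and bias.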
The $\tilde{c}_t$ denotes the candidate memory cell state at the current time step, where tanh denotes the hyperbolic tangent function:

$\tilde{c}_t = \tanh(W_c w_t + U_c h_{t-1} + b_c)$ (3)

The $c_t$ refers to the cell state value at the current time step; the values of $f_t$ and $i_t$ lie in (0, 1). The term $i_t \odot \tilde{c}_t$ indicates that new information is recorded in $c_t$, which is acquired from the candidate unit $\tilde{c}_t$. The term $f_t \odot c_{t-1}$ indicates the information that is retained from the existing memory $c_{t-1}$.

$c_t = i_t \odot \tilde{c}_t + f_t \odot c_{t-1}$ (4)

The $o_t$ is the output gate:

[...] of emotion classification. The DE model was introduced by Storn for the optimization of real parameters and real-valued functions. It is a population-based technique that is extensively applied to search problems. In recent times, the effectiveness of this method has been shown in diverse fields, such as strategy adaptation for global numerical optimization and FS for emotion analysis. Similar to the Genetic Algorithm (GA), DE applies crossover and mutation operators, but with an explicit updating function. The optimization task in DE comprises 4 steps, namely Initialization, Mutation, Crossover, and Selection. The last 3 phases are repeated until the termination criteria have been satisfied.
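The four DE steps can be sketched in Python as follows. This is a minimal sketch of the common DE/rand/1/bin variant, which is an assumption: the text does not state which mutation strategy, scale factor `F`, crossover rate `CR`, or termination rule was used, so a fixed number of generations is assumed here. The objective `toy_loss` is a hypothetical stand-in for the validation loss of a Bi-LSTM trained with a given batch size and number of hidden layers.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, generations=200,
                           F=0.5, CR=0.9, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin (assumed variant)."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Step 1, Initialization: sample the population uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    # Steps 2-4 repeat until termination (here: a fixed generation budget).
    for _ in range(generations):
        for i in range(pop_size):
            # Step 2, Mutation: combine three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][d] + F * (pop[b][d] - pop[c][d]) for d in range(dim)]
            mutant = [min(max(v, lo), hi) for v, (lo, hi) in zip(mutant, bounds)]

            # Step 3, Crossover: mix mutant and parent; j_rand forces one mutant gene.
            j_rand = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(dim)]

            # Step 4, Selection: keep the trial only if it is at least as good.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_loss(params):
    """Hypothetical surrogate loss with its minimum at batch size 64, 2 layers."""
    batch_size, hidden_layers = params
    return ((batch_size - 64.0) / 64.0) ** 2 + (hidden_layers - 2.0) ** 2


# Hypothetical search ranges for (batch size, number of hidden layers).
SEARCH_BOUNDS = [(16.0, 256.0), (1.0, 4.0)]
```

In an actual tuning run, `toy_loss` would be replaced by a function that rounds the candidate to integers, trains the Bi-LSTM with those settings, and returns a validation score to minimize; that step is omitted because it depends on details not given in the text.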
TABLE I
PERFORMANCE MEASURES OF AFFECT DIMENSION USING PROPOSED DE-BILSTM

Emotion    Precision (%)   Recall (%)   F-Measure (%)   Accuracy (%)
Anger      99.89           98.38        99.13           99.47
Fear       99.38           99.07        99.22           99.69
Joy        98.79           99.53        99.16           99.44
Sadness    97.16           98.73        97.94           99.29
Average    98.81           98.93        98.86           99.47
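The relationships among the columns of Table I can be checked with a short Python sketch. It assumes, since the text does not define them, that the F-measure is the harmonic mean of precision and recall and that the Average row is the unweighted mean over the four emotions. The values are copied from Table I, and the 0.01 tolerance allows for rounding to two decimals.

```python
# Values copied from Table I: (precision, recall, F-measure, accuracy) in percent.
TABLE_I = {
    "Anger":   (99.89, 98.38, 99.13, 99.47),
    "Fear":    (99.38, 99.07, 99.22, 99.69),
    "Joy":     (98.79, 99.53, 99.16, 99.44),
    "Sadness": (97.16, 98.73, 97.94, 99.29),
}
REPORTED_AVERAGE = (98.81, 98.93, 98.86, 99.47)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


def column_means(rows):
    """Unweighted mean of each column over the given rows."""
    return tuple(sum(col) / len(col) for col in zip(*rows))


def is_consistent(tolerance=0.01):
    """Check the F-measure column and the Average row against the assumptions."""
    f_ok = all(abs(f_measure(p, r) - f) < tolerance
               for p, r, f, _ in TABLE_I.values())
    means = column_means(TABLE_I.values())
    avg_ok = all(abs(m - rep) < tolerance
                 for m, rep in zip(means, REPORTED_AVERAGE))
    return f_ok and avg_ok
```

Under these assumptions, the per-emotion F-measures and the Average row of Table I agree with the recomputed values to within rounding.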
Fig. 2. Accuracy Analysis of Training and Validation at the time of Model Creation
Fig. 3. Loss Graph of Training and Validation at the time of Model Creation
Meanwhile, the presented technique has reached a higher precision of 98.81%, recall of 98.93%, F-measure of 98.86%, and accuracy of 99.47%.

Fig. 2 shows the accuracy graph obtained during the training and validation process of model creation over a varying number of epochs. From the figure, it is clear that the validation accuracy is moderate and keeps increasing over the training accuracy. Besides, it is observed that the accuracy improves as the number of epochs increases. Fig. 3 shows the loss graph obtained during the training and validation process of model creation under various epochs. The figure shows that the validation loss is gradually reduced when compared to the training loss. It is also observed that the loss decreases as the number of epochs increases.

From the above experimental analysis, it is clear that the presented DE-BiLSTM model performs significantly better than previous approaches. Hence, it can be employed as a suitable tool for emotion classification of content on social networking sites like Twitter, Instagram, and so on.

IV. CONCLUSION

This paper has developed a parameter-tuned BiLSTM using DE, called the DE-BiLSTM model, for the classification of multiple intensities of emotions in Twitter data. The proposed DE-BiLSTM classifies the Twitter data into a set of emotions, namely fear, anger, joy, and sadness. The DE-BiLSTM model is validated against the SEMEVAL2018 Task-1 Emotion Intensity Ordinal Classification dataset. The simulation results indicate that the DE-BiLSTM model outperforms the other methods, with an average precision of 93.83%, recall of 90.41%, F-measure of 91.76%, and accuracy of 96.42%. In the future, the proposed DE-BiLSTM model can be extended with hybridized evolutionary algorithms for parameter tuning.
REFERENCES
[10] Y. LeCun, Y. Bengio, and G. Hinton, “Deep Learning”,
Nature, vol. 521, pp. 436–444, 2015.
[11] D. Tang, B. Qin, and T. Liu, “Deep Learning for Sentiment
Analysis: Successful Approaches and Future Challenges”,
Wiley Interdiscip. Rev. Data Min. Knowl. Discov., vol. 5,
pp. 292–303, 2015.
[12] C. Baziotis, N. Athanasiou, A. Chronopoulou, and A.
Kolovou, “NTUA-SLP at SemEval-2018 Task 1:
Predicting Affective Content in Tweets with Deep
Attentive RNNs and Transfer Learning”, In Proceedings of
the 12th International Workshop on Semantic Evaluation,
New Orleans, LA, USA, 5–6 June 2018; pp. 245–255.
[13] H. Meisheri, and L. Dey, “TCS Research at Semeval2018
Task 1: Learning Robust Representations using Multi-
Attention Architecture”, In Proceedings of the 12th
International Workshop on Semantic Evaluation, New
Orleans, LA, USA, 5–6 June 2018; pp. 291–299.
[14] A. Graves and J. Schmidhuber, “Framewise phoneme
classification with bidirectional LSTM and other neural
network architectures”, Neural Networks, vol. 18, no. 5-6,
pp. 602–610, 2005.
[15] https://competitions.codalab.org/competitions/17751