EJMTC1866511614549600
Abstract
The rapid growth in the number of Internet users and social media platforms produces a massive amount of data. Companies are seeking an automated method to assess their customers' satisfaction with their products. Collecting and analyzing opinions and customer feedback from social media relies on what is called sentiment classification. A great deal of research has investigated opinions in English, whereas Arabic language analysis still faces numerous challenges and problems. In our current research, two powerful hybrid deep learning models, (CNN-LSTM) and (CNN-BiLSTM), are presented. Bidirectional LSTMs are an extension of conventional LSTMs that can make substantial improvements in sequence classification tasks, and CNN is applied to identify the most valuable features. Various data preparation processes are performed, and two regular deep learning models (CNN, LSTM) are implemented to conduct a series of experiments. Experimental results show that the two proposed models have superior performance compared to the baseline deep learning models (CNN, LSTM). Furthermore, the (CNN-BiLSTM) model exceeds the hybrid (CNN-LSTM) model, achieving the highest performance.

Keywords:
Bi-directional long short-term memory (BiLSTM), deep learning, convolutional neural network (CNN).

Corresponding Author:
Hossam Elzayady, Department of Computer Engineering, Military Technical College, Cairo, Egypt. Tel: 0020126828242, Email: hossamelzaiade@gmail.com
I. Introduction
Since the advent of the Internet, social media platforms have attracted remarkable attention as a significant means of communication worldwide. Today, the simplest way to stay in contact with others is instantly through social media applications. They can also be used to spread different opinions about what is happening all over the globe. As a result of its vast usage, social media generates a massive amount of posts, criticisms, and reviews. Establishments and organizations are also keen to learn about their clients' opinions on social media platforms; this is one example of the prominent task called "sentiment analysis"[1]. The majority of sentiment analysis work is focused on English data, leaving only a limited amount of research that handles Arabic data. It is believed that this is because Arabic resources for processing sentiments and feedback are rare[2]. Arabic is not as simple as English in its formation of lexical combinations, making sentiment analysis a more difficult challenge. The root of most Arabic words consists of three main letters, from which many words with different meanings can be formed[3,4]. Nevertheless, the number of Arabic posts on various topics increases massively on a daily basis, making Arabic analysis essential[5].

Machine learning in sentiment analysis relies heavily on feature engineering[3,5]. As a result, features play a significant role in classification[6]. As a consequence, most approaches pursue suitable features to produce outstanding performance[6,7]. The majority of machine learning algorithms use fixed-length feature vectors, so documents are represented as fixed-length feature vectors. Bag-of-words is believed to be one of the best techniques to represent a text, primarily due to its efficacy and clarity; however, it disregards word order[5]. This kind of method can lead to misclassification of sentiments, since phrases built from the same words can carry different meanings[8]. The n-gram is another popular form used to represent a phrase, and it is deemed superior to the others[5,9]. The classifier then takes the chosen representation of the input sentence as its input. N-gram models take word order into account within short phrases, but they suffer from data sparsity[7].

Deep learning is preferable to classical machine learning because "feature engineering" is unnecessary: deep learning features can be extracted by a completely automated process with no human expert involvement. Deep learning analyzes data through multiple layers, turning complex problems into simpler ones to support feature extraction. It also exposes the high-level features needed for classification in various activities[5]. Deep learning
has made significant progress in diverse aspects of Artificial Intelligence (AI). Speech detection and recognition, image captioning, and natural language tasks rely heavily on it[5,8]. In this research, two significant deep learning models (CNN, Bidirectional LSTM) are applied to Arabic customer feedback to build a sentiment classification approach. Taking into account the ambiguity of the Arabic language, multiple data preparation steps are carried out. Specific hyper-parameters are used for high precision, and the training phases are clearly stated. The outcomes show that the combined (CNN, bi-directional LSTM) model achieves remarkable precision compared to other models.

The remainder of this paper is organized as follows. Section 2 reviews the related work, while the baseline deep learning models used in our proposed method are discussed in Section 3. Section 4 illustrates our proposed combined models. The experimental outcomes and analysis are shown in Section 5. Finally, Section 6 presents the conclusion and a roadmap for future work.

II. Related Work
A large body of research has been conducted on sentiment analysis. This is largely due to the steady growth of data from individuals on social networks sharing their thoughts, considerations, perspectives, remarks, and everyday life[10]. Various forms of sentiment classification, applications, methods, and approaches are covered in depth in [8]. In 2012, the use of CNNs for image classification became well known and outperformed various other methodologies[11]. For sentence classification, CNNs for NLP showed notable results[12]. The RNN is a fundamental deep learning model; in 2010, Mikolov et al. presented a method applying an RNN model to speech recognition[13]. They show that the RNN outperforms the n-gram method. Because an RNN uses its prior state to compute its current one, it captures different aspects of linguistic structure, which suits it to most natural languages.

In [3], the researchers proposed an effective method for sentiment analysis of Arabic on the Twitter platform, which focused on lexical normalization of the original tweet language. To assess the polarity of each tweet's sentiment, the Support Vector Machines (SVM) classifier with a Bag of Words (BOW) representation had the highest sentiment classification accuracy. In [5], a deep learning methodology is adopted on Arabic datasets from various domains to classify customers' attitudes at a particular time. The findings indicate that deep learning outperforms the classic machine learning approach. The DNN achieved an average accuracy of 90.22%, precision of 90.56%, recall of 90.90%, and F-measure of 90.68%, compared to other common machine learning algorithms: Naïve Bayes, Decision Tree, and K-Nearest Neighbors. The researchers in [14] developed an efficient integrated CNN and LSTM model for English text classification. The model has strong potential to be far more precise than the fundamental models (CNN, LSTM). The authors in [24] offer an integrated (CNN-BiLSTM) model applied to articles in French newspapers. They also used Word2vec/Doc2vec embedding. The suggested model was compared to five deep learning models. According to the findings, the combined (CNN-BiLSTM) achieves the highest accuracy of 90.66%. The authors in [22] demonstrate that a deep learning model can outperform a classic machine learning model. On two Arabic datasets, the combination (CNN-LSTM) achieved the highest accuracy (85.38% and 86.88%).

III. Baseline Models

III.1 Convolutional neural networks (CNN)
CNNs are a popular type of deep learning architecture used mainly for image classification; recently they have also been used to classify text. Figure 1 describes the layout of the CNN structure used in text categorization. Each sentence is represented as a matrix, and each row of the matrix represents a word[15]. To make sure that all sentence matrices have the same size, padding is applied. A CNN involves several processes. First, the convolution process uses several filters to extract the most substantial features. These extracted features are then passed to the second process, pooling. The most common pooling technique is max pooling; this layer records the maximum value from each feature map[15,16]. The pooled features are then combined into a vector by a fully connected layer, which uses the softmax function to obtain the probability of each class.
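The CNN steps described above (padding, convolution, max pooling, and a softmax output) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names, the toy embeddings, the filter weights, and the ReLU applied inside the convolution are all made up for the example.

```python
import math

def pad_sentence(rows, max_len, dim):
    """Pad (or truncate) a sentence matrix to max_len rows of width dim."""
    padded = [list(r) for r in rows[:max_len]]
    padded += [[0.0] * dim for _ in range(max_len - len(padded))]
    return padded

def conv1d(matrix, kernel):
    """Slide one filter (k rows x dim) over the sentence; returns a feature map."""
    k = len(kernel)
    feature_map = []
    for start in range(len(matrix) - k + 1):
        total = 0.0
        for i in range(k):
            for x, w in zip(matrix[start + i], kernel[i]):
                total += x * w
        feature_map.append(max(0.0, total))  # ReLU activation (assumed)
    return feature_map

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cnn_forward(sentence, kernels, dense_weights, max_len, dim):
    """Padding -> convolution -> max pooling -> fully connected -> softmax."""
    matrix = pad_sentence(sentence, max_len, dim)
    pooled = [max(conv1d(matrix, kernel)) for kernel in kernels]  # max pooling
    scores = [sum(p * w for p, w in zip(pooled, row)) for row in dense_weights]
    return softmax(scores)

# Toy input: a 2-word sentence with 2-dimensional embeddings (made-up values).
sentence = [[1.0, 0.0], [0.0, 1.0]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],  # hypothetical filter 1, spans 2 words
    [[0.0, 1.0], [1.0, 0.0]],  # hypothetical filter 2, spans 2 words
]
dense_weights = [[1.0, 0.0], [0.0, 1.0]]  # one row of weights per class
probs = cnn_forward(sentence, kernels, dense_weights, max_len=4, dim=2)
print(probs)
```

Each filter yields one feature map, and max pooling keeps a single value per map, so the vector passed to the fully connected layer has one entry per filter regardless of sentence length.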
ESMT, Elzayady et al., 2021
III.2 Bi-directional long short-term memory neural networks
Bidirectional LSTM is an architecture evolved from the traditional LSTM that can improve performance on sequence classification problems[15]. The architecture of the LSTM is illustrated in Figure 2. An LSTM cell has three crucial parts: an input gate, a forget gate, and an output gate. The prime function of the input gate is to control the new input to the memory. The forget gate is in charge of preserving values in memory for a particular period of time. Finally, the output gate controls the amount of memory content required to activate the block[1, 17]. The bidirectional LSTM design permits the network to gather both forward and backward information about the sequence at each time step. What distinguishes the bidirectional LSTM from the conventional LSTM is that the input is processed in two ways: the first preserves information from past to future, and the second preserves information from future to past. By utilizing the two hidden states combined, information from both the past and the future can be preserved at any point in time. The gates are computed as in Equations (1) to (5). Each gate's weight matrices are denoted by U and W, while the bias is given by b; σ and tanh are activation functions.
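For orientation, a commonly used LSTM formulation that is consistent with the symbols named above (U, W, b, σ, tanh) is sketched below. It is a standard textbook form and may differ in detail from Equations (1) to (5). The assignment of W to the input x_t and U to the previous hidden state h_{t-1}, as well as the symbols x_t, h_t, c_t and the concatenation in the last line, are assumptions made for this sketch.

```latex
\begin{align*}
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{input gate} \\
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{forget gate} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{output gate} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{candidate memory} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state} \\
h_t^{\mathrm{bi}} &= \left[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\right] && \text{bidirectional output}
\end{align*}
```

In the bidirectional case, one LSTM reads the sequence from past to future and a second reads it from future to past; the last line combines their two hidden states at each time step.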
IV. Proposed models
IV.1 CNN-LSTM Model
The design of the proposed (CNN-LSTM) model is displayed in Figure 3. It comprises two major parts: (CNN) and (LSTM). CNN is used to extract higher-level word feature sequences, and LSTM to capture long-term correlations across window feature sequences. The integration starts with a convolution layer that takes word embeddings as inputs. Its output is then pooled down to smaller dimensions and fed into an LSTM layer. One strength of LSTMs is their ability to process sequential data while taking previous data into account. The output vectors from the dropout layer are used as inputs to this layer. Before being passed to a fully connected layer, the LSTM outputs are merged and arranged in a single matrix. The fully connected layer converts this array into a single output in the 0 to 1 range.
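To make the data flow through this stack concrete, the sketch below traces the output shape of each layer for a single review. The default sizes follow the training parameters listed in Table 1 (embedding dimension 32, 64 filters, kernel size 3, pool size 2, LSTM state dimension 200). The input length of 100 tokens, the unpadded stride-1 convolution, and the choice to pass only the final LSTM state to the fully connected layer are assumptions made for illustration.

```python
def conv1d_length(steps, kernel_size):
    """Output length of an unpadded, stride-1 1D convolution."""
    return steps - kernel_size + 1

def pool_length(steps, pool_size):
    """Output length of non-overlapping max pooling (remainder dropped)."""
    return steps // pool_size

def trace_cnn_lstm(seq_len, embed_dim=32, filters=64, kernel_size=3,
                   pool_size=2, lstm_units=200):
    """Return (layer name, output shape) pairs for one input review."""
    shapes = [("embedding", (seq_len, embed_dim))]
    steps = conv1d_length(seq_len, kernel_size)
    shapes.append(("convolution", (steps, filters)))
    steps = pool_length(steps, pool_size)
    shapes.append(("max pooling", (steps, filters)))
    shapes.append(("dropout", (steps, filters)))   # dropout keeps the shape
    shapes.append(("lstm", (lstm_units,)))         # final hidden state (assumed)
    shapes.append(("fully connected", (1,)))       # one score in the 0 to 1 range
    return shapes

# 100 tokens per review is an assumed input length, not a value from the paper.
for name, shape in trace_cnn_lstm(seq_len=100):
    print(f"{name:16s} {shape}")
```

The trace shows why pooling matters before the recurrent layer: it roughly halves the number of time steps the LSTM has to process.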
Engineering Science and Military Technologies
Volume (5) - Issue (1) - Mar 2021
IV.2 CNN-BiLSTM Model
Figure 4 demonstrates the layout of the CNN-BiLSTM model. The proposed model is viewed as an improvement of CNN-LSTM in which each LSTM cell is reinforced by two sets of hidden and cell states, one for the forward sequence and the other for the backward sequence. Our proposed model exploits the main features of both LSTM and CNN. LSTM can accommodate long-term dependencies and overcome the key issue of vanishing gradients. For this reason, LSTM is used when longer sequences are used as inputs. On the other hand, CNN is able to capture local patterns and position-invariant features of a text. The proposed architecture incorporates several layers. Initially, the embedding layer converts each input token to a vector and passes it through the convolution layer, which applies filters to refine the data and reduce its size. Moreover, a max-pooling layer is added after each filter. The outputs of the max-pooling layers are then combined to build the input of the BiLSTM. The results of this stage are the input of a fully connected layer, which links each piece of input information with a piece of output information. Finally, the softmax function is used as an activation function to assign a class to each sentence and obtain the required output. Dropout is applied at the penultimate layer during training as a regularizer: it eliminates co-dependencies and reduces overfitting by setting the activation to 0 for a random proportion p of the hidden units.
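The dropout step can be illustrated with a short sketch. It zeroes each hidden activation with probability p, as described above. Rescaling the surviving activations by 1/(1-p), the so-called inverted dropout, is an assumption of this sketch, as are the function name and the toy activations; p = 0.5 matches the dropout ratio in Table 1.

```python
import random

def dropout(activations, p, rng, training=True):
    """Zero each activation with probability p; rescale the rest by 1/(1-p)."""
    if not training or p == 0.0:
        return list(activations)   # dropout is disabled outside training
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

rng = random.Random(0)             # fixed seed, for a repeatable demonstration
hidden = [1.0] * 10000             # made-up hidden-unit activations
dropped = dropout(hidden, p=0.5, rng=rng)
zero_share = dropped.count(0.0) / len(dropped)
print(f"share of units set to 0: {zero_share:.3f}")
```

Because a different random subset of units is silenced on every training step, no unit can rely on the presence of a specific other unit, which is the co-dependency the text refers to.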
V. Experiments
V.1. Dataset
The LABR dataset has 16449 categorized reviews; in each row, 1 stands for a positive review while 0 stands for a negative review. The balance between the positive class and the negative class is taken into account: the dataset includes 8224 positive instances and 8225 negative instances. The book reviews were collected by [18, 19] from a public, community-driven website[20]. Figure 5 shows a screenshot of the dataset.
V.2. RMSProp optimizer
RMSProp stands for Root Mean Square Propagation. The key concept of this optimizer is to use a different learning rate for each weight, since specifying the same learning rate across all weights has proven ineffective. RMSProp divides a gradient by a running average of its recent magnitude[21, 22]. First, the running sum is determined by Equation (6), where s_t refers to the running sum for weight w_i, g_i indicates the gradient of w_i, and γ is the moving average term, whose value is 0.9. Therefore, when the gradient is large, it will be reduced. The modified update rule is given by Equation (7).

[...] network computation engines. The results are determined in accordance with the accuracy, precision, recall, and F1-score values:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{9}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{10}$$

$$\text{F1-score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{11}$$

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
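As an illustration of the RMSProp update described in Section V.2, the sketch below applies it to a single weight. The moving average term γ = 0.9 comes from the text; the learning rate, the small constant eps, its placement outside the square root, and the toy objective f(w) = w² are assumptions made for this example and may differ from Equations (6) and (7).

```python
import math

def rmsprop_step(weight, grad, avg_sq, lr=0.001, gamma=0.9, eps=1e-8):
    """One RMSProp update for a single weight; returns (new_weight, new_avg_sq)."""
    # Running average of the squared gradient for this weight.
    avg_sq = gamma * avg_sq + (1.0 - gamma) * grad * grad
    # Divide the gradient by the root of that average: large gradients are damped.
    weight = weight - lr * grad / (math.sqrt(avg_sq) + eps)
    return weight, avg_sq

# Minimise f(w) = w^2 (gradient 2w) from a made-up starting point.
w, s = 5.0, 0.0
for _ in range(2000):
    w, s = rmsprop_step(w, 2.0 * w, s, lr=0.01)
print(f"w after 2000 steps: {w:.4f}")
```

Since each weight keeps its own running average, weights with consistently large gradients take proportionally smaller steps, which is the per-weight learning rate behaviour the section describes.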
Table 1 shows all parameters used in the training stage. Model validation accuracy is observed for each epoch. The binary cross-entropy loss between the softmax layer's outputs and their matching labels is minimized.
Table 1. Parameters used in the training stage
Parameter                        Value
Embedding dimensions             32
Epochs                           10
Batch size                       64
Filters                          64
Convolution activation function  relu
Kernel size                      3
Pool size                        2
Dropout ratio                    0.5
Loss function                    binary_crossentropy
Optimizer                        rmsprop
LSTM state dimension             200
Word embeddings                  not pre-trained
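The binary cross-entropy loss named in Table 1 can be sketched as follows. This is a minimal plain-Python illustration, not the library routine used in the experiments; the function name, the clipping constant eps, and the example predictions are made up for the demonstration.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

labels = [1, 0, 1, 0]                     # 1 = positive review, 0 = negative
confident = [0.9, 0.1, 0.8, 0.2]          # made-up model outputs
unsure = [0.6, 0.4, 0.5, 0.5]
print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, unsure))
```

Predictions that are both correct and confident give a lower loss than hesitant ones, so minimizing this quantity pushes the model output toward 1 for positive reviews and toward 0 for negative ones.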
The average validation accuracy over 10 epochs for our models is shown in Figure 7. CNN-BiLSTM performs extremely well: our combined model achieves a validation accuracy of 92.4 percent, beating the CNN, LSTM, and hybrid (CNN-LSTM) models. The average accuracy of each model after 10 epochs, following 5 test runs, is presented in Table 1.

Table 2 shows the effectiveness of our proposed methodology, which combines the benefit of CNN in identifying local patterns with the capability of BiLSTM to leverage long-term dependencies. The results demonstrate that the CNN-BiLSTM model reaches an accuracy of 87.8%, exceeding CNN, LSTM, and (CNN-LSTM), which reach accuracies of 84.6%, 85.3%, and 86.6%, respectively. Our proposed model (CNN-BiLSTM) is also evaluated against another (CNN-LSTM) model in [21]. It is noticeable that using BiLSTM instead of the standard LSTM recurrent layer raises accuracy by 1%.
VI. Conclusion
This study proposes using an integrated (Bidirectional LSTM and CNN) model to examine sentiment in Arabic text reviews. First, reviews are represented by word vectors; then CNN is utilized to extract the most relevant features. The main objective of the BiLSTM is to capture the context information of the text. Finally, parameters are tuned to enhance the model. The experimental outcomes confirm the feasibility and effectiveness of our proposed model. In future work, the proposed model can be improved for Arabic categorization by using an attention mechanism. We also expect that precision can be increased by using word embeddings such as ELMo and fastText.

VII. Acknowledgments
The authors would like to thank Prof. Dr. Mohamed Elshafey and Dr. Ashraf Abosekeen, from the Electrical Engineering branch, Military Technical College, for their tremendous help and support.

VIII. References
[1] Vateekul, P., and Koomsubha, T. (2016, July). A study of sentiment analysis using deep learning techniques on Thai Twitter data. In Computer Science and Software Engineering (JCSSE), 2016 13th International Joint Conference on (pp. 1-6). IEEE.
[2] Hammad, M., and Al-awadi, M. (2016). Sentiment analysis for Arabic reviews in social networks using machine learning. In Information Technology: New Generations (pp. 131-139). Springer, Cham.
[3] Alwakid, G., Osman, T., and Hughes-Roberts, T. (2017). Challenges in Sentiment Analysis for Arabic Social Networks. Procedia Computer Science, 117, 89-100.
[4] Altowayan, A. A., and Tao, L. (2016, December). Word embeddings for Arabic sentiment analysis. In Big Data (Big Data), 2016 IEEE International Conference on (pp. 3820-3825). IEEE.
[5] Abdelhade, N., Soliman, T. H. A., and Ibrahim, H. M. (2017, September). Detecting Twitter Users' Opinions of Arabic Comments During Various Time Episodes via Deep Neural Network. In International Conference on Advanced Intelligent Systems and Informatics (pp. 232-246). Springer, Cham.
[6] Tang, D., Wei, F., Yang, N., Zhou, M., Liu, T., and Qin, B. (2014). Learning sentiment-specific word embedding for Twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 1555-1565).
[7] Le, Q., and Mikolov, T. (2014, January). Distributed representations of sentences and documents. In International Conference on Machine Learning (pp. 1188-1196).
[8] Medhat, W., Hassan, A., and Korashy, H. (2014). Sentiment analysis algorithms and applications: A survey. Ain Shams Engineering Journal, 5(4), 1093-1113.
[9] Alomari, K. M., ElSherif, H. M., and Shaalan, K. (2017, June). Arabic Tweets Sentimental Analysis Using Machine Learning. In International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (pp. 602-610). Springer, Cham.
[10] Pang, B., and Lee, L. (2008). Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval, 2(1–2), 1-135.
[11] Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[12] Kalchbrenner, N., Grefenstette, E., and Blunsom, P. (2014). A convolutional neural network for modelling sentences. arXiv preprint arXiv:1404.2188.
[13] Mikolov, T., Karafiát, M., Burget, L., Černocký, J., and Khudanpur, S. (2010). Recurrent neural network based language model. In Eleventh Annual Conference of the International Speech Communication Association.
[14] Sosa, P. M. (2017). Twitter sentiment analysis using combined LSTM-CNN models. Eprint Arxiv, 1-9.
[15] Roshanfekr, B., Khadivi, S., and Rahmati, M. (2017, May). Sentiment analysis using deep learning on Persian texts. In Electrical Engineering (ICEE), 2017 Iranian Conference on (pp. 1503-1508). IEEE.
[16] Senthil Kumar, N. K., and Malarvizhi, N. (2020). Bi-directional LSTM–CNN combined method for sentiment analysis in part of speech tagging (PoS). International Journal of Speech Technology, 23, 373-380.
[17] Li, D., and Qian, J. (2016, October). Text sentiment analysis based on long short-term memory. In 2016 First IEEE International Conference on Computer Communication and the Internet (ICCCI) (pp. 471-475). IEEE.
[18] Altowayan, A. A., and Tao, L. (2016, December). Word embeddings for Arabic sentiment analysis. In Big Data (Big Data), 2016 IEEE International Conference on (pp. 3820-3825). IEEE.
[19] Aly, M., and Atiya, A. (2013, August). LABR: A large scale Arabic book reviews dataset. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 494-498).
[20] http://www.goodreads.com
[21] Baktha, K., and Tripathy, B. K. (2017, April). Investigation of recurrent neural networks in the field of sentiment analysis. In Communication and Signal Processing (ICCSP), 2017 International Conference on (pp. 2047-2050). IEEE.
[22] Elzayady, H., Badran, K. M., and Salama, G. I. (2020). Arabic Opinion Mining Using Combined CNN-LSTM Models. International Journal of Intelligent Systems and Applications, 12(4).
[23] Zebin, T., Sperrin, M., Peek, N., and Casson, A. J. (2018, July). Human activity recognition from inertial sensor time-series using batch normalized deep LSTM recurrent networks. In 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 1-4). IEEE.
[24] Rhanoui, M., Mikram, M., Yousfi, S., and Barzali, S. (2019). A CNN-BiLSTM model for document-level sentiment analysis. Machine Learning and Knowledge Extraction, 1(3), 832-847.