(IJCST-V11I1P4): Rogaia Yousif, Ashraf Abdalla
ABSTRACT
Sentiment analysis (SA) is the process of extensively exploring data stored on the web to identify and categorize the views expressed in a piece of text. In sentiment classification, it is difficult to achieve accuracy or speed while removing semantic ambiguity from the text, and many machine learning methods share very similar features. Therefore, in this paper we extend sentiment analysis to the sentence level. We use a deep learning model for natural language processing to analyse the sentences present in a given English-language text, working with data from social media platforms such as Twitter. After processing the text and putting it into a form the classification model can understand, in order to find more efficient preprocessing techniques and a more accurate and faster way to analyse sentiment from text, we use deep learning models based on a recurrent neural network (RNN) with state memory and a multilayer cell structure, Long Short-Term Memory (LSTM). We experimented with and evaluated the method using recurrent neural networks and LSTM on a dataset of nearly 3,000 Amazon customer reviews (input text, star ratings, review dates, variants and feedback) for various Amazon Alexa products, training the models for sentiment analysis and using this data, together with API data, to analyse Amazon Alexa products and discover insights from consumer reviews, aided by deep learning models that achieve high emotion classification accuracy. A thorough evaluation shows that the LSTM model reaches 92.10% accuracy for positive/negative classification with a training time of 25 minutes on the dataset, and 93.18% accuracy for positive/negative classification with a training time of 10 seconds on the API data.
Keywords: Natural Language Processing (NLP); Sentiment Analysis; Twitter Platform; Deep Learning Classifiers; RNN; LSTM.
We find that the product of the TF and IDF matrices gives the normalized weights, which are the TF-IDF output. In this way we obtain the numerical input for the machine learning model. TF-IDF is used to represent text with a BoW (Bag of Words).
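Since the paper does not reproduce its implementation, the following is a minimal sketch of TF-IDF vectorization using scikit-learn's TfidfVectorizer; the reviews list and the vectorizer settings are illustrative assumptions rather than the authors' code.

# Minimal TF-IDF sketch with scikit-learn (an assumed choice; the paper does
# not name its implementation). The `reviews` list is placeholder data.
from sklearn.feature_extraction.text import TfidfVectorizer

reviews = [
    "Love my Echo, Alexa answers everything",
    "The speaker stopped responding after a week",
    "Sound quality is great for the price",
]

# Build a bag-of-words vocabulary, weight each term by TF * IDF, and
# L2-normalize every row so reviews of different lengths stay comparable.
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X_tfidf = vectorizer.fit_transform(reviews)      # sparse matrix (n_reviews, n_terms)

print(X_tfidf.shape)
print(vectorizer.get_feature_names_out()[:10])   # a few vocabulary terms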
Converting the Reviews into Numerical Vectors (Doc2Vec). The Doc2Vec algorithm performs considerably well on sentence-similarity tasks. However, if the input corpus includes many misspelled words, as tweets often do, this algorithm may not be an ideal choice. We used the Doc2Vec method for the vectorization of documents. It is an improved version of Word2Vec, and it is less desirable for corpora in which many misspellings occur. It is better to convert words to vectors and then use these vectors to build the vector representation of the whole document; Doc2Vec is used to represent text with word vectors.
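As a hedged illustration of this step, the sketch below vectorizes a few placeholder reviews with gensim's Doc2Vec; the vector_size, window and epochs values are assumptions, not the settings used in the paper.

# Hedged sketch of review vectorization with gensim's Doc2Vec.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

reviews = [
    "love my echo dot",
    "sound quality is terrible",
    "alexa understands me most of the time",
]

# Each review becomes a TaggedDocument: its token list plus a unique tag.
corpus = [TaggedDocument(words=text.split(), tags=[i])
          for i, text in enumerate(reviews)]

model = Doc2Vec(corpus, vector_size=100, window=5, min_count=1, epochs=40)

# Infer a fixed-length vector for a new, unseen review.
vec = model.infer_vector("alexa keeps misunderstanding me".split())
print(vec.shape)   # (100,)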
Train-Test Split of the Data. One of the golden rules in deep learning is to split your dataset into train, validation, and test sets. The reason we do that is very simple: if we did not split the data into different sets, the model would be evaluated on the same data it has seen during training.
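A minimal sketch of such a split, assuming an 80/10/10 ratio and scikit-learn's train_test_split (neither of which is specified in this excerpt); the texts and labels are placeholder data.

# Sketch of a train/validation/test split; the ratio, random_state and
# placeholder data are assumptions, not values taken from the paper.
from sklearn.model_selection import train_test_split

texts = ["loved it", "worst purchase ever", "works fine", "returned it"] * 250
labels = [1, 0, 1, 0] * 250    # 1 = positive, 0 = negative (placeholder labels)

# First hold out 20% of the data, then split that portion in half so the
# validation and test sets each contain 10% of the reviews.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    texts, labels, test_size=0.20, random_state=42, stratify=labels)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, random_state=42, stratify=y_tmp)

print(len(X_train), len(X_val), len(X_test))   # 800 100 100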
Fig. 2 LSTM model constructor
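The exact constructor shown in Fig. 2 is not reproduced in this excerpt; the following is a hedged Keras sketch of a stacked-LSTM binary sentiment classifier in that spirit, with an assumed vocabulary size, sequence length, layer widths and dropout rate.

# Hedged Keras sketch of an embedding + stacked-LSTM positive/negative
# classifier; all sizes below are assumptions, not the paper's values.
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

vocab_size = 10000   # assumed tokenizer vocabulary size
max_len = 100        # assumed padded review length

model = Sequential([
    Input(shape=(max_len,)),
    Embedding(input_dim=vocab_size, output_dim=128),   # token ids -> dense vectors
    LSTM(64, return_sequences=True),                   # first recurrent layer
    LSTM(32),                                          # second layer keeps final state
    Dropout(0.5),
    Dense(1, activation="sigmoid"),                    # positive/negative probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()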
Test cases for positive and negative emotions are demonstrated here with an example.
Case 1: Positive. For example, “omg!!! It is surprising” is categorized as positive. Further, it is categorized based on the percentage assigned to each emotion. Here we get 40.91% for surprise, 28.27% for relief, 11.64% for fun, 7.22% for neutral, and less than 5% for each of the other three emotions, which shows that surprise is the dominant emotion in this comment. Hence it is categorized as surprise.
Case 2: Negative. For example, “I am so panic these days” is categorized as a negative emotion statement. Further, it is categorized based on the percentage assigned to each emotion. Here we get 78.43% for worry, only 14.82% for sadness, 5.49% for empty, and the other three emotions together obtain only 1.26% of the total. So it is categorized as worry.
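A hedged sketch of the categorization rule these cases illustrate: a comment is assigned whichever emotion receives the highest predicted share. The numbers are the Case 1 percentages quoted above, with the remaining low-scoring emotions lumped into a hypothetical "other" bucket.

# Pick the dominant emotion by maximum predicted percentage (Case 1 values).
case1 = {"surprise": 40.91, "relief": 28.27, "fun": 11.64,
         "neutral": 7.22, "other": 11.96}

dominant = max(case1, key=case1.get)
print(dominant)   # -> "surprise", so the comment is categorized as surprise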
Fig. 6 Results of predictions
The results of the model comprise the accuracy and loss on the dataset and on new API data from Twitter, together with the execution time. The final results of the model prediction for the API data are depicted in Table 2.
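As a hedged illustration of how these figures could be produced, the snippet below reports loss, accuracy and execution time with Keras's evaluate; model reuses the classifier sketched earlier, and X_test, y_test, X_api, y_api are hypothetical padded token sequences and labels prepared with the same preprocessing as the training data.

# Report loss, accuracy and wall-clock time for the held-out test set and for
# newly collected API data (names reused from the earlier sketches; the
# API arrays are hypothetical).
import time

for name, features, targets in [("test", X_test, y_test), ("API", X_api, y_api)]:
    start = time.time()
    loss, acc = model.evaluate(features, targets, verbose=0)
    print(f"{name}: loss={loss:.4f}  accuracy={acc:.4f}  "
          f"time={time.time() - start:.1f}s")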